Executive Summary


Background. 


This  report  and  two  companion  reports  have  been 
derived  from  the  chapter  on  detection  theory  prepared 
for  the  High  Gain  Initiative  (HGI)  report.  The  two 
companion  reports,  "Detection  Processing  for 
Undersea  Surveillance"  (Confidential)  and  "Gaussian 
Mixture Models for Acoustic Interference" have an
undersea  surveillance  focus,  while  this  report  applies 
to  both  communications  and  undersea  surveillance. 
"Detection  Processing  for  Undersea  Surveillance" 
summarizes  the  status  of  detection  processing  for 
undersea  surveillance  at  the  time  of  the  initiation  of  the 
HGI  program  and  deals  primarily  with  adaptive 
filtering. It summarizes the culmination of many years
of  effort  in  the  field  by  J.  Zeidler  and  others. 

"Gaussian  Mixture  Models  for  Acoustic  Interference" 
summarizes  the  results  of  simulation  and  the  modeling 
of actual data collected during an HGI experiment by
Gaussian  mixture  models.  To  a  large  extent,  it 
summarizes  work  accomplished  by  D.  Stein.  This 
report is the culmination of work by J. Bond, S. Hui, D.
Stein,  and  others  on  adaptive  locally  optimum 
processing  for  interference  suppression  and  of  V. 
Broman  on  target  tracking. 

Initial  work  related  to  the  work  presented  in  this  report 
by  one  of  the  authors  (J.  Bond)  was  on  the  Non-linear 
Adaptive  Processor  (NONAP)  program  and  was 
funded  by  SPAWAR  153  beginning  in  1986.  A  parallel 
program  seeking  alternatives  to  NONAP  called  the 
Adaptive  Locally  Optimum  Detection  (ALOD)  program 
was  also  funded  by  SPAWAR  beginning  in  1989. 
Theoretical  work  by  Per  Kulstam  (Paircom  Inc.)  and 
Hank Schmidt (Technology Services Incorporated) on
the  NONAP  and  ALOD  programs  also  provided 
valuable  insights  into  the  work  on  adaptive  locally 
optimum  processing  presented  in  this  report.  The  first 
results  expounded  in  this  report  were  obtained  by  J. 
Bond  and  T.  Schlosser  during  1988  and  1989  under 
NRaD Independent Engineering Development (IED)
funds.  All  of  this  early  work  focused  on  the  use  of 
adaptive  locally  optimum  processing  for  interference 
suppression  from  Very  Low  Frequency/Low  Frequency 
bandspread  communication  signals  used  for 
submarine  communications.  The  theory  was 
extended  to  other  frequency  bands  and  other 
waveforms through funding by the Communications
and Networking NRaD 6.2 Block managed by Reeve
Peterson  (NRaD)  beginning  in  1992  and  to  undersea 
surveillance  through  the  funding  of  the  High  Gain 
Initiative  (HGI)  6.2  program  coordinated  by  Robert 
Hearn  (NRaD).  Adaptive  locally  optimum  processing 
has  a  long  academic  history  beginning  with  the  work  of 
David  Middleton  on  Statistical  Communication  Theory 
prior  to  1960  and  this  history  and  other  related  work 
outside  of  that  described  above  is  discussed  in  the 
body  of  this  report. 

Introduction. 


Adaptive  locally  optimum  processing  applies  when  the 
signal  is  dominated  by  the  interference  and  is  effective 
when  the  non-Gaussian  component  of  the  interference 
dominates  the  Gaussian  component.  We  develop  the 
theory  for  discrete  samples  of  the  baseband 
representation  of  a  communication  signal  (or  multiple 
baseband  representations  of  wideband  signals)  or  the 
analytical  representation  of  an  acoustic  signal. 

Practical  algorithms  have  been  derived  under  the 
assumptions  that  the  interference  component  of  the 
discrete  samples  is  independent  from  sample-to- 
sample  and  the  signal  has  known  phase  structure  or 
the  signal  has  unknown  phase  structure.  For  the 
former  case,  the  algorithms  involve  the  calculation  of 
first  derivatives  of  the  probability  density  functions  of 
amplitudes  and  phase  (or  symmetric-phase 
differences)  and  for  the  latter  case,  the  algorithms 
involve  the  calculation  of  second  derivatives  of  the 
probability  density  functions  of  amplitudes  and  phase 
(or  symmetric-phase  differences). 

We  have  identified  three  approaches  to  the 
development  of  practical  adaptive  locally  optimum 
processing:  (a)  model  the  interference  statistics  by  a 
parametrized  family  of  probability  density  functions 
and  estimate  the  parameters  in  real-time  from  the  data 
and  calculate  the  corresponding  adaptive  locally 
optimum  transformation  of  the  data,  (b)  model  the 
interference  statistics  as  above  and  identify  a  family  of 
candidate  models  for  the  interference,  compare  the 
real-time  statistics  with  each  of  the  stored  distributions 
to  identify  the  best  match  and  then  process  the  data 
accordingly,  and  (c)  build  an  implicit  model  of  the 
interference  statistics  using  the  real-time  data  and 
process  the  samples  accordingly.  The  material 
presented  in  this  report  reduces  to  practice  all  three  of 
these  approaches. 
Summary of Results.


The  technical  results  contained  in  this  report  include: 

(a)  formulas  for  optimum  processing  of 
complex  samples  when  the  signal  has  known  and 
unknown  phase  structure, 

(b)  formulas  for  optimum  processing  of 
complex sample amplitudes, phases, and symmetric
phase-differences when the signal has known and
unknown phase structure,

(c)  formulas  for  the  optimum  processing  of  the 
frequency  domain  representation  of  the  signal 
samples  when  the  signal  is  narrowband, 

(d)  explicit  formulas  for  processing  samples 
when  the  statistics  are  modeled  by  either  non-central 
or  central  mixture  models  for  which  the  parameters 
are  estimated  or  implicit, 

(e) bounds and estimates, in terms of model
parameters, of the processing gain over traditional
processing for the above algorithms when the
statistics are modeled by either non-central or central
mixture models,

(f)  comparisons  of  processing  gain  bounds 
and  estimates  with  performance  estimates  obtained 
through  simulations  for  two-  and  three-state  mixture 
models, 

(g)  simulation  results  for  averages  of  the 
transformed  samples,  and 

(h)  results  for  the  loss  of  performance  due  to 
target  motion  for  processing  at  the  output  of  a 
beamformer. 

Summary  of  Approach. 


Central to our approach is the maximization of the
deflection of a detector. The form of the detector
maximizing deflection is obtained through use of the
Cauchy-Schwarz inequality and is the likelihood ratio.
A Taylor expansion of the probability density
function of signal plus noise about zero signal is used
to express the probability density function of signal
plus noise in terms of the probability density function
of noise alone. If the signal has known structure, the
optimum detector is dominated by the linear terms in
the Taylor expansion, and practical algorithms are
obtained by replacing the signal by the signal divided
by its norm and using only the linear term of the Taylor
expansion. If the signal has unknown structure, it is
reasonable to suppose that the mean values of the
real and imaginary components of the signal are zero;
as a result, the quadratic terms of the Taylor
expansion dominate, and practical algorithms are
obtained by replacing the second-order signal terms
by signal variances and using only the quadratic term
of the Taylor expansion.

Amplitude  and  phase  algorithms  are  obtained  by 
converting  from  inphase/quadrature  to  polar 
coordinates  and  assuming  that  the  sample  amplitudes 
and  phases  are  uncorrelated.  Amplitude  and  phase 
processing  then  become  parallel  processes.  For 
wideband  signals,  the  signal  can  be  recovered  from  a 
symmetric  phase-difference,  so  that  the 
preprocessing  step  to  remove  interference 
sample-to-sample  phase  correlation  by  replacing  the 
phases  by  symmetric  phase-differences  leads  to 
effective  phase  processing  for  many  cases  when 
processing  of  phase  itself  would  be  ineffective. 

The recognition that the use of Gaussian kernel
representations of the probability density functions of
amplitudes and symmetric phase-differences can be
viewed as making use of an implicit non-central
Gaussian mixture model provides the key to unifying
algorithms discovered for communications and
algorithms developed for undersea surveillance using
central Gaussian mixture models. We obtain
processing gain bounds, defined by the ratio of the
deflection of the adaptive locally optimum processing
detector to the deflection of the traditional detector,
in terms of the mixture model parameters.

We  verified  through  simulations  that  for  two-state  and 
three-state  mixture  models,  performance  measured  by 
deflection  and  performance  defined  by  probability  of 
detection  for  a  specified  probability  of  false  alarm  are 
highly  correlated. 

Finally,  we  examined  the  performance  of  line  detectors 
obtained  by  averaging  of  sample  detector  values.  For 
traditional  undersea  surveillance  applications,  when 
the  processing  is  used  after  beamforming,  this  is 
closely  related  to  "eye  integration"  used  to  detect  the 
presence  of  narrowband  signals  in  time  histories  of 
Fourier  coefficient  magnitudes  represented  by  a  grey 
scale  display  as  a  function  of  frequency  (abscissa)  and 
time  (ordinate).  For  HGI  arrays  with  spatial  cells,  the 
spatial  resolution  of  the  array  will  be  high  and  it  is 
desirable  to  replace  "eye  integration"  by  automated 
processing.  For  this  situation  the  average  detector  is 
an  upper  bound  for  obtainable  performance  and  we 
obtained  some  preliminary  results  on  the  degradation 
of  performance  due  to  unknown  target  motion  for 
several  candidate  tracking  algorithms. 
Conclusion.


Simulations  and  processing  of  real  data  indicate  that 
adaptive  locally  optimum  processing  can  provide 
significant  gains  over  traditional  processing.  A 
satisfactory  theory  now  exists  that  leads  to  practical 
algorithms  to  implement  this  processing  for  many 
communication  and  undersea  surveillance 
applications. 

Adaptive Locally Optimum Processing.

Middleton  (1960)  developed  a  statistical 
communication  theory  that  addressed,  among  other 
topics,  optimum  ways  to  detect  weak  communication 
signals  in  the  presence  of  non-Gaussian  interference. 
His  original  theory  addressed  the  estimation  of  a 
communication  signal  from  the  probability  density 
function  of  received  signal  plus  interference.  Later 
Middleton (1966, 1967, 1977, 1983, 1984, 1991) and
Middleton and Spaulding (1983, 1986) extended the
theory  to  include  communication  and  undersea 
surveillance  applications  by  introducing  Gaussian 
mixture  models. 

Within  a  general  statistical  detection  framework,  the 
decision of whether a signal is present or not is usually
based on the likelihood ratio $p_{n+s}(z)/p_n(z)$, where the
numerator is the probability density function for
vectors of complex samples z containing signal plus
noise and the denominator is the probability density
function for vectors of complex samples z containing
noise alone. For many applications, the challenge in
implementing optimal processing is to obtain an
estimate of $p_n(z)$ given a received signal sample that
may  contain  signal  as  well  as  noise.  Techniques  exist 
to  solve  this  problem  when  the  signal  is  stronger  than 
the  noise.  However,  these  techniques  are  of  little 
interest  in  surveillance  because  the  signals  would  be 
detected  by  using  traditional  techniques. 

Among  the  approaches  to  implement  detection 
algorithms  are  those  using  the  likelihood  ratio, 
approximations  to  the  likelihood  ratio,  minimax  criteria, 
and  nonparametric  techniques.  A  test  for  the 
presence  of  signal  based  on  the  likelihood  ratio 
provides  the  maximum  probability  of  detection  at  a 
given false-alarm rate (Poor, 1988; Poor and Thomas,
1978; Whalen, 1971), but it requires knowledge of the
signal-plus-noise and noise-only probability densities.
The probability density function of noise is difficult to
obtain from observed signal-plus-noise samples
unless the signal is dominated by the noise.

A  minimax  approach  can  be  used  to  account  for 
uncertainty in the class of distributions that describe
the  noise  statistics.  Such  an  approach  is  based  on 
the  designation  of  a  cost  function  and  classes  of 
possible  noise  densities  and  signal-plus-noise 
densities.  The  Bayes  or  Neyman-Pearson  criteria  can 
be used to define a cost function, and the
ε-contaminated class of densities is often used
(Kassam  and  Poor,  1985;  Poor,  1988).  The  minimax 
algorithm  selects  the  detector  that  minimizes  a 
maximum  cost  over  the  classes  of  density  functions. 
For  example,  whereas  the  matched  filter  implements  a 
likelihood  ratio  for  known  signals  in  stationary 
independent Gaussian noise (Berry, 1981; Poor,
1988), the correlator-limiter is the minimax detector for
a known signal in stationary independent noise with a
distribution belonging to the class of ε-contaminated
mixture distributions with a nominal Gaussian
distribution (Kassam and Poor, 1985; Poor, 1988).
Another minimax approach, perhaps more common, is
to minimize the maximum cost (or risk) over the class
of prior probabilities (Whalen, 1971, p. 135).

More  prosaic  techniques  are  also  related  to  optimal 
detection.  One  such  technique  is  the  use  of  various 
kinds  of  clippers  or  automatic  gain  controls  in  military 
radios,  designed  to  reduce  the  impact  of  impulsive 
noise  or  interference  on  the  reception  of 
communication signals (Blachman, 1964, 1971a,
1971b, 1982, 1992; Arnstein, 1991, 1992). These
techniques  are  especially  effective  for  very  low 
frequency  (10  to  30  kHz)  and  low  frequency  (30  to  60 
kHz)  communications,  when  environmental  noise 
dominated  by  lightning  generated  interference  can  be 
well  modeled  as  an  additive  sum  of  Gaussian  noise 
and  high-power  short-duration  pulses. 
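As a rough illustration of the clipping idea described above, the following sketch applies a symmetric hard clipper to real samples and forms a correlator-limiter style statistic. The function names, the threshold value, and the sample data are assumptions made for this example; they are not taken from any of the cited radios or references.

```python
def clip(sample, threshold):
    """Limit a real sample to the interval [-threshold, threshold]."""
    return max(-threshold, min(threshold, sample))


def correlator_limiter(reference, received, threshold):
    """Correlate a known reference with clipped received samples.

    Clipping bounds the contribution of any single high-power impulse,
    so one impulse cannot dominate the detection statistic.
    """
    return sum(r * clip(x, threshold) for r, x in zip(reference, received))


# Hypothetical data: the second received sample is hit by an impulse.
reference = [1.0, 1.0, -1.0, 1.0]
received = [0.8, 40.0, -1.1, 0.9]
statistic = correlator_limiter(reference, received, threshold=2.0)
```

Without the clipper, the impulse of 40 would swamp the other three samples in the sum; with it, the impulse contributes no more than the threshold.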

One  of  the  most  common  approaches  to  obtaining 
estimates  of  the  probability  density  function  for  signals 
plus  noise  is  to  relate  this  probability  density  function 
to  the  one  for  noise  alone  under  the  assumption  that 
the  signal  is  weaker  than  the  noise.  This  theory  is  of 
interest  for  detecting  communication  signals  in  the 
presence  of  jamming  and  for  detection  of  masked 
submarine  lines  in  undersea  surveillance.  The 
optimum processing techniques developed under the
assumption that the signal is weaker than the noise are
known as adaptive locally optimum processing
techniques.


The basic idea leading to the various locally optimum
processing techniques is to expand $p_{n+s}(z)$ in a Taylor
series about s = 0. By using this expansion, one
obtains an approximation for the likelihood ratio

$$\frac{p_{n+s}(z)}{p_n(z)}$$

that is valid when s is small.
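To make the small-signal expansion concrete, the sketch below treats a single real sample with an additive signal, so that the signal-plus-noise density is taken to be $p_n(x - s)$ (an assumption of this example). To first order in s the likelihood ratio is then $1 + s\,g(x)$ with $g(x) = -p_n'(x)/p_n(x)$. The two-state zero-mean Gaussian mixture and all parameter values are illustrative choices, not values from this report.

```python
import math


def gaussian_pdf(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, eps, sigma0, sigma1):
    """Zero-mean two-state Gaussian mixture (assumed noise model)."""
    return (1.0 - eps) * gaussian_pdf(x, sigma0) + eps * gaussian_pdf(x, sigma1)


def mixture_pdf_derivative(x, eps, sigma0, sigma1):
    """Derivative of mixture_pdf with respect to x."""
    return (-(1.0 - eps) * x / sigma0 ** 2 * gaussian_pdf(x, sigma0)
            - eps * x / sigma1 ** 2 * gaussian_pdf(x, sigma1))


def lo_nonlinearity(x, eps, sigma0, sigma1):
    """First-order locally optimum transformation g(x) = -p_n'(x) / p_n(x)."""
    return (-mixture_pdf_derivative(x, eps, sigma0, sigma1)
            / mixture_pdf(x, eps, sigma0, sigma1))


def likelihood_ratio_exact(x, s, eps, sigma0, sigma1):
    """p_n(x - s) / p_n(x) for an additive signal s."""
    return mixture_pdf(x - s, eps, sigma0, sigma1) / mixture_pdf(x, eps, sigma0, sigma1)


def likelihood_ratio_first_order(x, s, eps, sigma0, sigma1):
    """Linear term of the Taylor expansion about s = 0."""
    return 1.0 + s * lo_nonlinearity(x, eps, sigma0, sigma1)
```

For purely Gaussian noise (eps = 0) the transformation reduces to the linear map x / sigma0**2; with a wide second component, large samples are strongly de-emphasized relative to that linear map.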


The  application  of  adaptive  locally  optimum  processing 
to  a  particular  problem  entails  (1)  estimating  the 
probability  density  function  and  (2)  calculating  the 
transformation  of  the  data  to  determine  the  presence 
of  the  signal.  We  briefly  survey  the  different  locally 
optimum  algorithms  that  have  been  developed. 


The  different  algorithms  arise  from  the  different  ways 
of  modeling  the  signal  and  the  noise.  There  are  three 
general approaches to modeling the noise:
parametric,  nonparametric,  and  model  fitting.  The 
parametric  and  nonparametric  techniques  are 
discussed  in  detail  for  various  mixture  models  of  the 
noise  in  this  section.  The  model  fitting  technique  offers 
an  alternative  approach.  The  information  required  to 
implement it is developed through an analysis of the
parameter  estimation  techniques. 


Parametric  estimation  proceeds  by  assuming  that  the 
probability  density  function  of  the  noise  is  from  a 
family  of  probability  density  functions  described  by  a 
finite  set  of  parameters.  Given  the  data,  the 
parameters  are  estimated  from  the  data  and  in  this 
way  a  probability  density  function  is  chosen  to 
represent  the  samples.  The  appropriate 
transformation  of  the  data  samples  can  then  be 
calculated  from  the  chosen  probability  density 
function. 


Middleton  studied  probability  density  functions,  called 
Gaussian  mixture  models,  that  are  described  by 
infinite  sums  of  Gaussian  distributions.  The  family  of 
probability  density  functions  is  described  by  either  two 
or  three  parameters.  Given  these  parameters,  the 
optimum  transformation  of  the  received 
signal-plus-interference-plus-noise  samples  to 
estimate  the  signal  can  be  calculated.  In  particular, 
Middleton  (1966,  1967,  1977,  1983,  1984,  1991)  and 
Middleton  and  Spaulding  (1983,  1986)  have 
formulated  adaptive  locally  optimum  detectors  based 
upon Middleton's class A noise model. See appendix
B for a discussion of the Middleton class A noise
model. The Middleton class A noise model is
especially  appropriate  for  the  detection  of 
communication  signals  in  impulsive  interference,  and 
has  been  suggested  for  modeling  underwater  acoustic 
noise  by  Middleton. 

Bouvet  and  Schwartz  (1988,  1989)  compare  the 
performance  of  the  likelihood  ratio  detector  derived 
from  a  two-state  Gaussian  mixture  model,  the 
matched  filter,  and  the  correlator-limiter,  for  detection 
of  known  signals  in  shipping  noise  measured  at  sea. 
They  showed  that  the  performances  of  the  matched 
filter  and  the  correlator-limiter  are  similar  and  the 
likelihood  ratio  detector  based  on  a  Gaussian  mixture 
model provides improved performance at some
false-alarm probabilities and signal-to-noise ratios.
Baker and Gualtierotti (1986, to appear) have
developed  likelihood  ratio  detection  algorithms  for  a 
general  class  of  signals  in  circularly  invariant  noise, 
which  generalizes  the  Gaussian  mixture  model. 

These  generalized  Gaussian  mixture  models  play  a 
central role in our theory of adaptive locally optimum
processing for signals of unknown structure.

For the nonparametric approach, an empirical model
of  the  interference  probability  density  function  is 
constructed  from  samples  of  the  interference.  The 
techniques  for  estimating  the  probability  density 
function  include  fitting  polynomials  to  histograms, 
using  kernel  representations,  and  obtaining  the 
estimates  from  finite  differences  of  quantiles  of 
ordered  samples  (which  represent  an  estimate  of  the 
cumulative  probability  function  of  the  samples). 
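Of the nonparametric estimates listed above, the kernel representation is the simplest to sketch. The example below assumes real-valued samples, a Gaussian kernel, and a hand-chosen bandwidth (all assumptions of this example). It evaluates $g(x) = -\hat{p}'(x)/\hat{p}(x)$ directly from the kernel sums, so the density estimate never has to be formed explicitly.

```python
import math


def kernel_lo_transform(x, noise_samples, bandwidth):
    """Return -p'(x)/p(x) for a Gaussian kernel density estimate.

    With u_k = (x - x_k) / h and w_k = exp(-u_k**2 / 2), the kernel
    estimate and its derivative share the same normalization, so
    -p'(x)/p(x) = sum(u_k * w_k) / (h * sum(w_k)).
    """
    num = 0.0
    den = 0.0
    for xk in noise_samples:
        u = (x - xk) / bandwidth
        w = math.exp(-0.5 * u * u)
        num += u * w
        den += w
    if den == 0.0:
        # x is so far from every sample that all weights underflow;
        # returning 0 here is a design choice of this sketch.
        return 0.0
    return num / (bandwidth * den)
```

A single kernel centered at zero with unit bandwidth gives g(x) = x, the Gaussian case; a set of samples symmetric about zero gives g(0) = 0.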

Recently,  both  the  Air  Force  and  Navy  have  funded 
extensive  efforts  to  investigate  the  implementation  of 
adaptive  locally  optimum  processing  techniques  in 
military  radios.  Both  of  these  efforts  focused  on 
real-time  estimation  of  the  probability  density  function, 
motivated  by  the  consideration  that  jamming  signals 
may  have  structure  with  very  general  statistical 
properties  that  are  under  the  control  of  an  adversary. 
The  Air  Force  effort  was  undertaken  by  Hazeltine 
Corporation  for  Rome  Air  Development  Center 
(RADC) (Murphy, Tilley, and Torre, 1990) and the
Navy  effort  was  initiated  by  Johns  Hopkins  University 
Applied  Physics  Laboratory  (JHU/APL)  (Higbie,  1988). 

The  Air  Force  effort  focused  on  processing  the  real 
and  imaginary  components  of  the  baseband  samples 
based  on  fitting  a  polynomial  to  the  histogram  of 
received signal samples. This is a very natural
approach to describing the statistics and obtaining a
differentiable probability density function from which
the likelihood function can be calculated. Their work
involved a general theory of how best to fit a
polynomial to various probability density functions and
the  performance  achievable  by  using  adaptive  locally 
optimum  processing. 

An  alternative  approach  to  estimating  statistics  for 
time  domain  processing  has  been  developed  for  the 
Navy  by  Higbie  (1988).  He  estimates  the  probability 
density  function  of  amplitudes  and  phase-differences 
by finite differences of quantiles of the cumulative
probability functions of sample amplitudes and
phase-differences.  He  obtains  the  quantiles  by  sorting 
a  block  of  successive  received  signal  baseband 
sample  amplitudes  and  phase-differences  according  to 
their  magnitudes.  The  algorithm  is  a  sliding  block 
algorithm  with  the  samples  used  symmetrical  around 
the  sample  being  transformed.  Laboratory  tests  of 
the  algorithms  have  shown  that  the  techniques  provide 
better  detection  performance  for  bandspread 
communications  signals  in  the  presence  of  wideband 
interference  than  any  previously  implemented 
techniques. 
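The idea of estimating a density from finite differences of quantiles can be sketched as follows. This is not Higbie's algorithm, whose details are not given here; it only illustrates, under assumed names and an assumed half-width m, how sorting a block of samples yields a density estimate at each sorted value.

```python
def quantile_density_estimate(block, m=1):
    """Estimate the density at each sorted sample from quantile spacing.

    Roughly (hi - lo) / n of the probability mass lies between the
    order statistics xs[lo] and xs[hi], so the density near xs[k] is
    approximated by that mass divided by the spread xs[hi] - xs[lo].
    """
    xs = sorted(block)
    n = len(xs)
    density = []
    for k in range(n):
        lo = max(k - m, 0)
        hi = min(k + m, n - 1)
        spread = xs[hi] - xs[lo]
        # Tied samples give zero spread; report an infinite estimate
        # rather than guessing (a choice made for this sketch).
        density.append(((hi - lo) / n) / spread if spread > 0 else float("inf"))
    return xs, density


# Hypothetical block: ten evenly spaced samples on [0, 1).
xs, density = quantile_density_estimate([k / 10 for k in range(10)], m=1)
```

For the evenly spaced block the estimate is close to 1 everywhere, as expected for a uniform density on an interval of unit length.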

Another  family  of  locally  optimum  processing 
techniques  has  been  developed  by  Bond  (1991)  and 
Bond and Hui (to appear) as a Navy effort. Bond and
Hui  developed  their  algorithms  from  kernel 
representations  of  probability  density  functions  and 
studied  the  resulting  algorithms  from  an  analytical 
point  of  view.  Bond  found  that  processing  amplitudes 
and  phases,  as  well  as  phase-differences,  could  be 
quite  effective.  His  adaptive  locally  optimum 
processing,  like  that  of  Higbie,  does  not  require  that 
the  probability  density  function  of  the  noise  be 
described  by  a  few  parameters. 

The  model  approach  uses  preprocessing  to  identify 
typical  probability  density  functions  describing  the 
statistics  of  signal  plus  noise  that  might  be  received. 

A  family  of  these  density  functions  is  then  selected 
and  stored  along  with  the  optimal  processing  to  use  for 
each  case.  The  incoming  samples  are  then  processed 
to  obtain  the  best  match  between  their  density  function 
and  a  stored  density  function.  The  samples  are  then 
processed  by  using  the  processing  associated  with  the 
stored density function that best fits the incoming data.
In this approach, the probability of detection is
conditioned on the comparison of the empirical density
of received signal plus noise or noise only with the
stored probability densities.

Schloz and Giles (1990, 1992) and Schloz (1991,
1992)  implemented  the  model  approach  by  obtaining  a 
family  of  candidate  distributions  of  the  received  signal 
plus  interference  and  noise  by  preprocessing  data 
samples  representative  of  the  channel  for  which  the 
processing is to be used. Then the distribution of
successive blocks of samples of the received signal
plus interference and noise is compared to each of
the  candidate  distributions  in  real  time  and  the 
processing  of  these  samples  is  based  on  the  best 
match.  This  approach  is  easy  to  implement  in  a 
digital  receiver  with  substantial  processing  capability 
and  memory,  because  the  distribution  comparisons 
can  be  done  in  parallel  and  the  parameters  describing 
the  optimum  processing  stored  in  memory. 
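The matching step of the model approach can be sketched as below. The histogram bins, the squared-difference distance, and the candidate names are assumptions of this example; the source does not state which comparison measure the cited implementations use.

```python
def histogram(samples, edges):
    """Normalized histogram of samples over the bins defined by edges."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def best_match(block, edges, candidates):
    """Return the name of the stored distribution closest to the block.

    candidates maps a name to a stored normalized histogram on the
    same bins; closeness is the sum of squared bin differences.
    """
    observed = histogram(block, edges)

    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(observed, candidates[name]))

    return min(candidates, key=distance)


# Hypothetical stored models and one incoming block of samples.
edges = [-3.0, -1.0, 1.0, 3.0]
candidates = {"narrow": [0.1, 0.8, 0.1], "wide": [0.4, 0.2, 0.4]}
block = [0.0, 0.2, -0.3, 0.5, -2.0, 0.1, -0.4, 0.3, 2.5, 0.0]
chosen = best_match(block, edges, candidates)
```

In a receiver, the name returned would select the stored processing parameters for that block; each candidate comparison is independent of the others, which is what allows them to run in parallel.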

Adaptive  locally  optimum  processing  could  be 
effectively  used  before  beamforming,  for  moderate  or 
short-range  surveillance  applications  involving  few 
hydrophones,  because  in  these  cases  the 
interferer-to-signal ratio at the hydrophone level and at
the beamformer output level will often be only slightly
different. In contrast, for ocean basin surveillance,
many  of  the  interfering  signals  that  might  mask 
surveillance  signals  of  interest  would  not  stand  out 
from  the  general  background  noise  at  the  hydrophone 
level,  precluding  the  use  of  the  adaptive  locally 
optimum processing.

Adaptive  locally  optimum  processing  techniques  could 
be  used  after  a  time  domain  beamformer  and  before 
spectral  analysis.  For  a  frequency  domain 
beamformer,  the  output  could  be  transformed  back  to 
the  time  domain  or  processed  in  the  frequency 
domain.  The  noise  statistics  depend  on  the 
bandwidth  of  the  frequency  domain  representation 
when  transformed  back  to  the  time  domain  in  the  first 
case  and  the  frequency  resolution  of  the  spectral 
analysis  in  the  second  case.  The  noise  statistics  of 
the  beamformed  data  for  a  particular  interferer  also 
depend  on  the  propagation  modes  of  the  interferer 
signal  to  the  receiving  array.  For  a  matched-field 
beamformer,  the  transformation  back  to  the  time 
domain  may  not  always  be  feasible  because  the 
of the frequency dependence of the beamforming.

Motivated  by  these  considerations,  we  developed 
adaptive  locally  optimum  processing  techniques 
suitable  for  processing  either  time  domain  or 
frequency  domain  signals. 
In the subsequent sections, we present an integrated
theory  of  adaptive  locally  optimum  processing  suitable 
for  processing  beamformed  time  domain  and 
frequency  domain  data.  We  proceed  from  first 
principles  by  showing  how  maximizing  the  deflection,  a 
natural  measure  of  detector  performance  (defined 
below),  leads  to  likelihood  ratios.  Imposition  of  the 
small  signal  hypothesis  allows  the  likelihood  ratios  to 
be  evaluated  in  terms  of  derivatives  of  the  probability 
density  function  of  interference  and  noise.  These 
formulas  involving  derivatives  applied  to  various 
interference  models  lead  to  algorithms  for  processing 
the  data.  Our  treatment  includes  both  parametric  and 
nonparametric  modeling  of  the  interference.  Formulas 
for  first-order  and  second-order  detectors  are 
described  depending  on  whether  the  signal  has  known 
or  unknown  structure.  The  performance  of  the  various 
algorithms  is  then  established  by  analysis  and 
simulation. 

Deflection  and  Likelihood  Ratios. 


A  unified  treatment  of  nonlinear  processing  for  time 
domain  and  frequency  domain  beamformed  data  is 
obtained  by  treating  received  signal  samples  as 
complex  numbers.  For  time  domain  outputs,  the 
complex  samples  are  obtained  by  replacing  the  real 
signal with its analytic signal. The analytic signal is
mathematically determined from the real signal by
using the Hilbert transform (Papoulis, 1977). If $s(t)$ is
the signal and $\hat{s}(t)$ is its Hilbert transform, then
$s_a(t) = s(t) + i\hat{s}(t)$ is the analytic signal. In
communication  systems,  the  signal  is  often  modulated 
by  a  fixed  frequency  after  reception  and  then  highpass 
filtered.  In  either  case,  the  analytic  signal  is  often 
called  the  baseband  representation  with  the  real  part 
called  the  in-phase  component  and  the  imaginary  part 
called  the  quadrature  component  of  the  baseband 
sample.  Some  authors  call  samples  of  the  analytic 
signals  complex  samples. 
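For sampled data, one common way to approximate the analytic signal is to zero the negative-frequency half of the discrete Fourier transform and double the positive-frequency half. The sketch below does this with a direct (slow) transform so that it needs only the standard library; the function names and the test tone are assumptions of this example, and a practical implementation would use a fast transform.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(real_samples):
    """Complex samples whose real part is the input and whose
    imaginary part approximates its Hilbert transform."""
    n = len(real_samples)
    spectrum = dft(real_samples)
    weights = [0.0] * n
    weights[0] = 1.0                      # keep the zero-frequency bin
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin
        for k in range(1, n // 2):
            weights[k] = 2.0              # double positive frequencies
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    return dft([w * c for w, c in zip(weights, spectrum)], inverse=True)


# Hypothetical test tone: two cycles of a cosine over 16 samples.
n = 16
tone = [math.cos(2.0 * math.pi * 2 * m / n) for m in range(n)]
z = analytic_signal(tone)
```

For this tone the result is the complex exponential with the same frequency: the in-phase component reproduces the cosine and the quadrature component is the corresponding sine.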

Throughout this section, the inputs to the nonlinear
processing algorithms are complex samples of either
an  analytic  signal  or  complex  Fourier  coefficients.  We 
adopt the following notation. Let $z_j$ denote the j-th
complex sample with $x_j$ and $y_j$ denoting its in-phase
and quadrature components, respectively, and
$\bar{z}_j = x_j - i y_j$ denote the complex conjugate of $z_j$. The
complex number $x_j + i y_j$ also can be represented as a
vector $(x_j, y_j)$. We define the norm of $x_j + i y_j$, or
equivalently the length of $(x_j, y_j)$, by

$$|x_j + i y_j| = |(x_j, y_j)| = \sqrt{x_j^2 + y_j^2}.$$

A  common  starting  point  for  the  development  of 
adaptive  locally  optimum  processing  algorithms  for 
processing  time  domain  and  frequency  domain 
beamformer  outputs  is  provided  by  considering  the 
maximization  of  the  deflection  for  sequences  of 
samples.  A  powerful  idea  in  functional  analysis  is  to 
optimize a functional on a space of functions (Rudin,
1973).  It  often  turns  out  that  the  optimal  function  has 
many  other  desirable  properties.  We  have  found  this 
to  be  the  case  for  deflection  and  make  use  of  it 
throughout  this  subsection  to  obtain  many  useful 
adaptive  locally  optimum  processing  results.  In  the 
next  subsection  we  relate  deflection  to  probability  of 
line  detection  for  a  given  probability  of  false  alarm. 

Deflection measures (in noise standard deviation
units) how different the expected value of the
detector is for signal plus noise and noise alone. The
deflection  of  a  detector  is  a  natural  extension  of 
detector  output  signal-to-noise  for  cases  when  the 
expected  value  of  the  detector  under  noise  alone  may 
be  nonzero.  Even  though  deflection  for  a  particular 
detector  only  involves  second-order  statistics, 
maximization  of  deflection  over  a  class  of  functions 
involves  all  the  moments  of  the  probability  density 
functions  of  signal  plus  noise  and  noise  alone. 

Suppose u is any real-valued detection variable.
Then the deflection $\delta(u)$ of u is

$$\delta(u) = \frac{E_{n+s}(u) - E_n(u)}{\sigma_n(u)},$$

where $E_{n+s}(u)$, $E_n(u)$, and $\sigma_n(u)$ denote the
expected  value  of  the  detection  quantity  for  signal  plus 
noise,  the  expected  value  of  the  detection  quantity  for 
noise,  and  the  standard  deviation  of  the  detection 
quantity  for  noise,  respectively.  It  can  happen  that 
the  detection  problem  is  trivial,  for  example  if  the 
known  distributions  of  signal  plus  noise  and  noise  only 
do  not  intersect  or  if  the  noise  is  zero.  We  preclude 
these  cases  throughout  this  section  by  assuming  that 
the  deflection  exists  and  is  finite  for  the  signal  plus 
noise  and  noise  only  samples  and  by  assuming  that 
$p_n(x,y)$ does not vanish except on a set with
probability  0. 
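The definition of deflection translates directly into a sample estimate when detector outputs are available under both hypotheses. The sketch below assumes two lists of real detector outputs (hypothetical data) and uses the population standard deviation of the noise-only outputs in the denominator.

```python
import statistics


def deflection(outputs_signal_plus_noise, outputs_noise_only):
    """Sample estimate of (E_{n+s}(u) - E_n(u)) / sigma_n(u)."""
    mean_sn = statistics.fmean(outputs_signal_plus_noise)
    mean_n = statistics.fmean(outputs_noise_only)
    sigma_n = statistics.pstdev(outputs_noise_only)
    return (mean_sn - mean_n) / sigma_n


# Hypothetical detector outputs: the signal shifts every output by 0.5.
noise_only = [-1.0, 1.0, -1.0, 1.0]
signal_plus_noise = [-0.5, 1.5, -0.5, 1.5]
estimate = deflection(signal_plus_noise, noise_only)
```

Here the noise-only outputs have zero mean and unit standard deviation, so the estimate equals the shift in the mean.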

For the time being, we restrict our treatment to
real-valued detectors $D: \mathbf{R}^2 \to \mathbf{R}$, that is, to
real-valued functions of pairs of real numbers which
are the real and imaginary components of complex
samples of the analytic signal. Furthermore, we
assume that the detectors D are functions with finite
deflection. Our strategy is to see how much we can
learn about the functional form of a detector $D(x,y)$
with finite deflection by imposing the condition that it
maximizes the absolute value of the deflection. The
information available about the noise is the probability
density function $p_n(x,y) = p(z \mid n)$. Calculation of the
expectation of D given that it contains signal and
noise requires the probability density function
$p_{n+s}(x,y)$.


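We first obtain the form of the detector that maximizes
deflection. Let D be a detector with finite deflection
and let $\hat{D}(x,y) = D(x,y) - E_n(D(x,y))$. It is clear
from the definition of the deflection that $\delta(\hat{D}) = \delta(D)$
and $E_n(\hat{D}(x,y)) = 0$. The square of the denominator
of the deflection is simply

$$\sigma_n^2(\hat{D}(x,y)) = \iint_{\mathbb{R}^2} \hat{D}^2(x,y)\, p_n(x,y)\, dx\, dy.$$

The Cauchy-Schwarz inequality provides the
mechanism to determine the functional form of the
optimal $\hat{D}(x,y)$. In its general form, the Cauchy-Schwarz
inequality applies to any measure $\mu$ and for
finite integrals defined by the measure

$$\left(\int f g\, d\mu\right)^2 \le \int f^2\, d\mu \int g^2\, d\mu$$

with equality if and only if $f = cg$ for some constant c.
We have

$$E_{n+s}(\hat{D}(x,y)) - E_n(\hat{D}(x,y)) = \iint_{\mathbb{R}^2} \hat{D}(x,y)\,[p_{n+s}(x,y) - p_n(x,y)]\, dx\, dy$$

and letting

$$h_p(x,y) = \frac{p_{n+s}(x,y) - p_n(x,y)}{\sqrt{p_n(x,y)}},$$

$$\left(\iint_{\mathbb{R}^2} \hat{D}(x,y)\,[p_{n+s}(x,y) - p_n(x,y)]\, dx\, dy\right)^2 = \left(\iint_{\mathbb{R}^2} \left[\hat{D}(x,y)\sqrt{p_n(x,y)}\right] h_p(x,y)\, dx\, dy\right)^2 \le \left[\sigma_n^2(\hat{D}(x,y))\right]\left[\iint_{\mathbb{R}^2} h_p(x,y)^2\, dx\, dy\right].$$

Therefore,

$$\delta^2(D) \le \iint_{\mathbb{R}^2} \frac{[p_{n+s}(x,y) - p_n(x,y)]^2}{p_n(x,y)}\, dx\, dy$$

with equality if, and only if, for some $c \ne 0$,

$$\hat{D}(x,y)\sqrt{p_n(x,y)} = c\,\frac{p_{n+s}(x,y) - p_n(x,y)}{\sqrt{p_n(x,y)}}$$

or equivalently,

$$\hat{D}(x,y) = c\left(\frac{p_{n+s}(x,y)}{p_n(x,y)} - 1\right).$$

Note that the deflection is independent of $c \ne 0$ and
the constant 1 and we can conclude that

$$D(x,y) = \frac{p_{n+s}(x,y)}{p_n(x,y)}$$

is also a detector which maximizes deflection,
hereafter referred to as an optimum detector.

If $s = a + ib$ and the samples consist of signal plus
additive noise, then $p_{n+s}(x,y) = p_n(x-a, y-b)$. If we
assume that the signal is small, a Taylor expansion
can be used to express this latter probability in terms
of $p_n(x,y)$ and its derivatives:

$$p_n(x-a, y-b) = p_n(x,y) - \frac{\partial p_n(x,y)}{\partial x}\,a - \frac{\partial p_n(x,y)}{\partial y}\,b + \frac{1}{2}\left[\frac{\partial^2 p_n(x,y)}{\partial x^2}\,a^2 + 2\,\frac{\partial^2 p_n(x,y)}{\partial x\,\partial y}\,ab + \frac{\partial^2 p_n(x,y)}{\partial y^2}\,b^2\right] + \text{higher order terms in } a \text{ and } b.$$

The optimum detector $D(x,y)$ given the signal $a + ib$
is, up to second-order terms,

$$D(x,y) = -\frac{1}{p_n(x,y)}\frac{\partial p_n(x,y)}{\partial x}\,a - \frac{1}{p_n(x,y)}\frac{\partial p_n(x,y)}{\partial y}\,b + \frac{1}{2}\frac{1}{p_n(x,y)}\frac{\partial^2 p_n(x,y)}{\partial x^2}\,a^2 + \frac{1}{p_n(x,y)}\frac{\partial^2 p_n(x,y)}{\partial x\,\partial y}\,ab + \frac{1}{2}\frac{1}{p_n(x,y)}\frac{\partial^2 p_n(x,y)}{\partial y^2}\,b^2.$$

We call the sum of the first-order terms in the above
detector the first-order detector and the sum of the
second-order terms the second-order detector.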
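
The Cauchy-Schwarz argument above can be checked numerically on a finite sample space, where the integrals become sums. The sketch below is a discrete analogue introduced only for illustration; the three-point probability tables and all function names are hypothetical and do not come from the report.

```python
import math

def deflection(d, p_noise, p_signal):
    """Deflection of detector values d[i] on a finite sample space."""
    mean_n = sum(di * pi for di, pi in zip(d, p_noise))
    mean_s = sum(di * qi for di, qi in zip(d, p_signal))
    var_n = sum((di - mean_n) ** 2 * pi for di, pi in zip(d, p_noise))
    return (mean_s - mean_n) / math.sqrt(var_n)

def optimum_detector(p_noise, p_signal):
    """Likelihood-ratio form p_{n+s}/p_n - 1 evaluated at each point."""
    return [qi / pi - 1.0 for pi, qi in zip(p_noise, p_signal)]

def deflection_bound(p_noise, p_signal):
    """Sum of (p_{n+s} - p_n)^2 / p_n, the bound on the squared deflection."""
    return sum((qi - pi) ** 2 / pi for pi, qi in zip(p_noise, p_signal))

# Hypothetical noise-only and signal-plus-noise probabilities.
p_n = [0.5, 0.3, 0.2]
p_ns = [0.4, 0.3, 0.3]
d_opt = optimum_detector(p_n, p_ns)
d_other = [0.0, 1.0, 2.0]  # an arbitrary competing detector
```

In this example the squared deflection of `d_opt` equals the bound, while the arbitrary detector `d_other` falls below it.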

We next consider the case when the signal is not fixed
but is given by $s = X + iY$, where X and Y are random
variables. It can be shown, by using techniques similar
to the known signal case, that the optimum detector
still has the form $D(x,y) = \frac{p_{n+s}(x,y)}{p_n(x,y)} - 1$ and the
terms of the form $a^m b^n$ in the Taylor series expansion
should be replaced by the expected values $E(X^m Y^n)$.

If X and Y are independent, then $a^m$ and $b^m$ can be
replaced by the m-th moments of X and Y,
respectively. In particular, if X and Y are independent
and identically distributed with zero mean and variance
$\sigma^2$, the optimum detector has, up to second-order
terms, the form

$$D(x,y) = \frac{\sigma^2}{2}\,\frac{1}{p_n(x,y)}\left[\frac{\partial^2 p_n(x,y)}{\partial x^2} + \frac{\partial^2 p_n(x,y)}{\partial y^2}\right].$$

The detector output for a number of samples, say
$z_1, \ldots, z_N$, which need not be independent, can be
combined to give a new detection quantity

$$\frac{1}{N}\sum_{k=1}^{N} D(z_k).$$

We call this multisample detector the line detector in
anticipation of its use to detect narrowband signals of
interest in ocean basin surveillance. For a sequence
of independent samples, the probability density
function is the product of the probability density
functions of each sample:

$$p(x_1,y_1,x_2,y_2,\ldots,x_N,y_N) = p(x_1,y_1)\,p(x_2,y_2)\cdots p(x_N,y_N).$$

We have

$$\frac{1}{p(x_1,y_1)\,p(x_2,y_2)\cdots p(x_N,y_N)}\,\frac{\partial\left(p(x_1,y_1)\,p(x_2,y_2)\cdots p(x_N,y_N)\right)}{\partial x_k} = \frac{1}{p(x_k,y_k)}\frac{\partial p(x_k,y_k)}{\partial x_k}.$$

Thus, the optimum detector $D(z_1,z_2,\ldots,z_N)$ given
the sample vector $(z_1,z_2,\ldots,z_N)$ with independent
$z_1,z_2,\ldots,z_N$ is, up to second-order terms,

$$D(z_1,z_2,\ldots,z_N) = -\sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial p(x_k,y_k)}{\partial x}\,a_k - \sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial p(x_k,y_k)}{\partial y}\,b_k + \frac{1}{2}\sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial^2 p(x_k,y_k)}{\partial x^2}\,a_k^2 + \sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial^2 p(x_k,y_k)}{\partial x\,\partial y}\,a_k b_k + \frac{1}{2}\sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial^2 p(x_k,y_k)}{\partial y^2}\,b_k^2,$$

which is a constant times the line detector.
Therefore, the line detector is optimum for
independent samples. The performance of the line
detector is discussed later in this section. The
remainder of this subsection treats first-order and
second-order detectors.

Amplitude and Phase Processing.


For many applications, it is more natural to model the
amplitudes and phases of complex samples than their
real and imaginary components. Let $A = \sqrt{x^2 + y^2}$
and $\theta = \arg(x + iy)$ be the amplitude and phase of the
sample $z = x + iy$. The phase $\theta$ should be assigned
to avoid discontinuities of $2\pi$, which is always
possible if $A > 0$. The process of assigning phase in
a continuous manner is known as phase unwrapping
(Oppenheim and Schafer, 1989; Tribolet, 1977).
Hereafter, we assume that all phases are unwrapped
phases because for some of the processing algorithms
discussed later it is important to avoid unnecessary
discontinuities in the phase.

Suppose $p(x,y) = \frac{p(A)\,q(\theta)}{A}$, where $p(A)$ is the
probability density function of the amplitudes of the
complex samples z, and $q(\theta)$ is the probability
density function of the phases $\theta$ of the complex
samples z. Note that the factor $1/A$ is necessary so
that $p(x,y)$ is a probability density when $p(A)$ is a
probability density and when $q(\theta)$ is a probability
density because

$$\iint_{\mathbb{R}^2} p(x,y)\, dx\, dy = \int_0^{\infty}\!\int_0^{2\pi} p(A)\,q(\theta)\, d\theta\, dA.$$
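
The conversion from complex samples to amplitudes and unwrapped phases assumed above can be sketched as follows. This is a simple successive-difference unwrapping written for illustration, not the algorithm of the cited references; the function name is hypothetical, and the sketch assumes that the true phase changes by less than $\pi$ between successive samples and that no sample has zero amplitude.

```python
import cmath
import math

def amplitude_and_unwrapped_phase(samples):
    """Return amplitudes and phases with 2*pi jumps between samples removed."""
    amplitudes = [abs(z) for z in samples]
    phases = []
    previous = None
    for z in samples:
        theta = cmath.phase(z)  # principal value in (-pi, pi]
        if previous is not None:
            # Add the multiple of 2*pi that brings theta closest to the
            # previous unwrapped phase.
            theta += 2.0 * math.pi * round((previous - theta) / (2.0 * math.pi))
        phases.append(theta)
        previous = theta
    return amplitudes, phases
```

For a unit-amplitude sample sequence whose phase advances by 0.5 radian per sample, the returned phases increase steadily instead of wrapping back into $(-\pi, \pi]$.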
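It is quite informative to express the optimum detector
that we obtained previously in terms of $p(x,y)$
in terms of the partial derivatives of $p(A)$ and $q(\theta)$. The
algebra is simpler if we introduce $\tilde{p}(A) = \frac{p(A)}{A}$, so that
$p(x,y) = \tilde{p}(A)\,q(\theta)$. Observe that
$\frac{\partial A}{\partial x} = \frac{x}{A}$, $\frac{\partial A}{\partial y} = \frac{y}{A}$, $\frac{\partial \theta}{\partial x} = -\frac{y}{A^2}$, and
$\frac{\partial \theta}{\partial y} = \frac{x}{A^2}$. We next calculate the partial derivatives
occurring in the first- and second-order detectors:

$$\frac{\partial^2 p(x,y)}{\partial x^2} = \left[\tilde{p}''(A)\frac{x^2}{A^2} + \tilde{p}'(A)\frac{y^2}{A^3}\right]q(\theta) + 2\left[\tilde{p}'(A)\frac{x}{A}\right]\left[q'(\theta)\frac{-y}{A^2}\right] + \tilde{p}(A)\left[q''(\theta)\frac{y^2}{A^4} + \frac{2xy}{A^4}\,q'(\theta)\right],$$

$$\frac{\partial^2 p(x,y)}{\partial y\,\partial x} = \left[\tilde{p}''(A)\frac{xy}{A^2} - \tilde{p}'(A)\frac{xy}{A^3}\right]q(\theta) + \left[\tilde{p}'(A)\,q'(\theta)\,\frac{x^2 - y^2}{A^3}\right] - \tilde{p}(A)\left[q''(\theta)\frac{xy}{A^4} + \frac{x^2 - y^2}{A^4}\,q'(\theta)\right],$$

and

$$\frac{\partial^2 p(x,y)}{\partial y^2} = \left[\tilde{p}''(A)\frac{y^2}{A^2} + \tilde{p}'(A)\frac{x^2}{A^3}\right]q(\theta) + 2\left[\tilde{p}'(A)\frac{y}{A}\right]\left[q'(\theta)\frac{x}{A^2}\right] + \tilde{p}(A)\left[q''(\theta)\frac{x^2}{A^4} - \frac{2xy}{A^4}\,q'(\theta)\right].$$

The first-order detector is

$$-\frac{1}{\tilde{p}(A)\,q(\theta)}\left[\frac{\partial p(x,y)}{\partial x}\,a + \frac{\partial p(x,y)}{\partial y}\,b\right] = -\frac{xa + yb}{A}\,\frac{\tilde{p}'(A)}{\tilde{p}(A)} - \frac{-ya + xb}{A^2}\,\frac{q'(\theta)}{q(\theta)}$$

and the second-order detector is

$$\frac{1}{\tilde{p}(A)\,q(\theta)}\left[\frac{1}{2}\frac{\partial^2 p}{\partial x^2}\,a^2 + \frac{\partial^2 p}{\partial x\,\partial y}\,ab + \frac{1}{2}\frac{\partial^2 p}{\partial y^2}\,b^2\right] = \frac{1}{2}\left(\frac{xa + yb}{A}\right)^2\frac{\tilde{p}''(A)}{\tilde{p}(A)} + \frac{1}{2}\left(\frac{-ya + xb}{A}\right)^2\frac{\tilde{p}'(A)}{A\,\tilde{p}(A)} + \frac{1}{2}\left(\frac{-ya + xb}{A}\right)^2\frac{q''(\theta)}{A^2\,q(\theta)} + \frac{xy(a^2 - b^2) + ab(y^2 - x^2)}{A^3}\left(\frac{1}{A} - \frac{\tilde{p}'(A)}{\tilde{p}(A)}\right)\frac{q'(\theta)}{q(\theta)}.$$

The quotients $\frac{\tilde{p}'(A)}{\tilde{p}(A)}$ and $\frac{\tilde{p}''(A)}{\tilde{p}(A)}$ are related to $\frac{p'(A)}{p(A)}$
and $\frac{p''(A)}{p(A)}$ by the following relations:

$$\frac{\tilde{p}'(A)}{\tilde{p}(A)} = \frac{p'(A)}{p(A)} - \frac{1}{A}$$

and

$$\frac{\tilde{p}''(A)}{\tilde{p}(A)} = \frac{p''(A)}{p(A)} - \frac{2\,p'(A)}{A\,p(A)} + \frac{2}{A^2}.$$

We make the following observation about the
first-order and second-order detectors for a small
signal. When a small signal is present, the linear terms
are expected to dominate the second-order and
higher-order terms so that the optimum detector is
closely approximated by the terms involving only first
partial derivatives. Indeed, for the case of uniform
phase when the probability density function factors
into probability density functions of amplitude and
phase, the second-order term has expected value 0
with signal present up to fourth-order signal terms, in
contrast to the linear term which has nonzero
expected value with signal terms of second order. An
outline of the argument follows.

Observe that, to first order in a and b,

$$p_{n+s}(x,y) = p_n(x - a, y - b) = \tilde{p}\left(\sqrt{(x-a)^2 + (y-b)^2}\right) q\left(\arctan\frac{y-b}{x-a}\right)$$

$$\approx \tilde{p}\left(A - \frac{ax + by}{A}\right) q\left(\theta - \frac{bx - ay}{A^2}\right)$$

$$\approx \left[\tilde{p}(A) - \frac{ax + by}{A}\,\tilde{p}'(A)\right]\left[q(\theta) - \frac{bx - ay}{A^2}\,q'(\theta)\right]$$

$$= \frac{1}{2\pi}\left[\tilde{p}(A) - (a\cos\theta + b\sin\theta)\,\tilde{p}'(A)\right],$$

where the last line uses $q(\theta) = \frac{1}{2\pi}$. Now, we are ready to calculate $E_{n+s}[D_2(x,y)]$,
the expected value of the second-order terms

$$D_2(x,y) = \frac{1}{2}\left(\frac{ax + by}{A}\right)^2\frac{\tilde{p}''(A)}{\tilde{p}(A)} + \frac{1}{2}\left(\frac{bx - ay}{A}\right)^2\frac{\tilde{p}'(A)}{A\,\tilde{p}(A)}$$

under the assumption that $q(\theta) = \frac{1}{2\pi}$. By making use of the
fact that $E_n[D_2(x,y)] = 0$,

$$E_{n+s}[D_2(x,y)] = -\frac{1}{2\pi}\int_0^{\infty}\!\int_0^{2\pi}\left(\frac{1}{2}(a\cos\theta + b\sin\theta)^2\,\frac{\tilde{p}''(A)}{\tilde{p}(A)} + \frac{1}{2}(b\cos\theta - a\sin\theta)^2\,\frac{\tilde{p}'(A)}{A\,\tilde{p}(A)}\right)(a\cos\theta + b\sin\theta)\,\tilde{p}'(A)\,A\, d\theta\, dA = 0$$

because

$$\int_0^{2\pi}(a\cos\theta + b\sin\theta)^3\, d\theta = 0$$

and

$$\int_0^{2\pi}(a\cos\theta + b\sin\theta)(b\cos\theta - a\sin\theta)^2\, d\theta = 0.$$

We now give a geometric interpretation of the first-order
detector. Recall that the first-order detector has
the form:

$$-\frac{xa + yb}{A}\,\frac{\tilde{p}'(A)}{\tilde{p}(A)} - \frac{-ya + xb}{A^2}\,\frac{q'(\theta)}{q(\theta)}.$$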

Call the operation in the first term of the above sum
amplitude processing and the second term phase
processing. Let $s = a + ib$ and $z = x + iy$. Then
$(ax + by)/A$ and $(-ay + bx)/A$ are the projections of s
onto z and $z^{\perp} = iz$, respectively. Note that

$$\frac{ax + by}{A} = \mathrm{Re}\left[\frac{z^*}{|z|}\,s\right] \quad\text{and}\quad \frac{-ay + bx}{A} = \mathrm{Im}\left[\frac{z^*}{|z|}\,s\right].$$

Let

$$g(A) = -\frac{\tilde{p}'(A)}{\tilde{p}(A)} = -\frac{p'(A)}{p(A)} + \frac{1}{A} \quad\text{and}\quad h(\theta) = -\frac{q'(\theta)}{q(\theta)}.$$

The amplitude processing can be viewed as
consisting of a first step

$$z_k \to g(|z_k|)\,\frac{z_k}{|z_k|},$$

where the "nonlinearity" $g(|z_k|)$ is used to "weight" the samples, followed by a
second signal reconstruction step:

$$g(|z_k|)\,\frac{z_k}{|z_k|} \to g(|z_k|)\,\mathrm{Re}\left[\frac{z_k}{|z_k|}\,\frac{s_k^*}{|s_k|}\right].$$

Similarly, the phase processing can be viewed as a
two-step process: weighting the samples and
reconstructing the signal. The real part contains all the
signal information, so taking the real part is a natural
thing to do in either the amplitude or phase
processing. Note further that the amplitude and phase
processing naturally complement each other:
amplitude processing depends on the projection of the
signal onto the received signal-plus-interference
baseband sample and phase processing depends on
the projection of the signal onto the received
signal-plus-interference baseband sample rotated
counterclockwise by 90°.

As for in-phase and quadrature quantities, the above
result can be extended to unknown signals by
replacing the various powers of a and b with the
moments of the real and imaginary parts of the
desired signal.
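
The projection identities and the two processing terms above can be sketched in a few lines. The function names, and the convention that the caller supplies the amplitude nonlinearity `g` and the phase nonlinearity `h`, are assumptions made for illustration only.

```python
import cmath

def projections(s, z):
    """Components of s along z and along z rotated counterclockwise by 90 degrees.

    Returns ((a*x + b*y)/A, (b*x - a*y)/A) for s = a + ib and z = x + iy.
    """
    w = z.conjugate() * s / abs(z)
    return w.real, w.imag

def first_order_detector(s, z, g, h):
    """Amplitude term g(A)*(ax+by)/A plus phase term h(theta)*(bx-ay)/A^2."""
    amplitude = abs(z)
    theta = cmath.phase(z)
    parallel, perpendicular = projections(s, z)
    return g(amplitude) * parallel + h(theta) * perpendicular / amplitude
```

As a sanity check under an assumed circular Gaussian noise model with unit variance per component, $g(A) = A$ and $h(\theta) = 0$, and the detector reduces to the ordinary correlation $ax + by$.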

We have seen that the optimal detectors for amplitude
and phase involve projections of the signal onto the
received signal and the received signal rotated
counterclockwise by 90°. Considerable insight into
adaptive locally optimum processing is gained by
characterizing the signal information contained in
these projections. Toward this end, consider a
baseband sample $z = u + n + s$ with u and n the
structured and random components of the noise,
respectively. The structured component of the noise
is often referred to as interference, while the random
component is referred to as noise.

The amplitude and phase of a complex baseband
sample z can be decomposed into vector components
parallel and perpendicular to the interferer vector u as
shown in figure 1, provided that the signal and noise
are much less than the interferer. Let $s_{\parallel}$ and $s_{\perp}$
denote the projections of the vector s onto the vector u
and the vector u rotated counterclockwise by 90° and
let $n_{\parallel}$ and $n_{\perp}$ denote the projections of the vector n
onto the vector u and the vector u rotated
counterclockwise by 90°. Let $S_{\parallel}$, $S_{\perp}$, $N_{\parallel}$, and $N_{\perp}$ denote
the real numbers defined by $s_{\parallel} = S_{\parallel}\frac{u}{|u|}$,
$s_{\perp} = S_{\perp}\frac{u_{\perp}}{|u|}$, $n_{\parallel} = N_{\parallel}\frac{u}{|u|}$, and $n_{\perp} = N_{\perp}\frac{u_{\perp}}{|u|}$, where
$u_{\perp}$ denotes the vector u rotated counterclockwise by
90°. Then $|z| \approx |u| + S_{\parallel} + N_{\parallel}$ and
$\theta \approx \phi + \frac{S_{\perp}}{|u|} + \frac{N_{\perp}}{|u|}$, where $\phi$ denotes the phase of
the interferer.


Each of the projections of the signal contains half of
the available information on the signal. This follows
from the fact that

$$s_{\parallel} = \frac{s + e^{2i\phi}s^*}{2} \quad\text{and}\quad s_{\perp} = \frac{s - e^{2i\phi}s^*}{2},$$

where $s^*$ is the complex conjugate of s. For a
known signal, if good estimates of both $s_{\parallel}$ and $s_{\perp}$ are
available, their sum provides a good estimate of the
signal up to its magnitude. In some cases, adaptive
locally optimum processing of amplitudes may be
successful, while that of phases is unsuccessful, and
in other cases processing of phases may be
successful, while that of amplitudes is unsuccessful.
For a known signal, if a good estimate of either
$s_{\parallel}$ or $s_{\perp}$ is available, the signal can also be recovered
if the output is despread for a bandspread signal,
spectrally analyzed for a narrowband signal, or
beamformed. In each of these cases, the signal
processing following adaptive locally optimum
processing separates the desired signal term s from
the undesired signal distorting term $e^{2i\phi}s^*$ because of
the presence of $\phi$.


As an example, let us consider direct sequence
bandspread communications in some detail. The
known signal detection by correlation in this case is
called despreading. This constitutes the situation for
which most of the theoretical work on adaptive locally
optimum processing for communications has been
applied. Let s be the spreading sequence (the signal
of known structure). Let B denote a set of successive
baseband samples of the received signal associated
with a given bit. Then despreading consists of the
complex sample correlation:

$$\sum_{z_j \in B} z_j\,\frac{s_j^*}{|s_j|}.$$

Observe then that the outputs of the amplitude and
phase algorithms after despreading are

$$\sum_{z_j \in B}\frac{s_j + e^{2i\phi_j}s_j^*}{2}\,\frac{s_j^*}{|s_j|} \quad\text{and}\quad \sum_{z_j \in B}\frac{s_j - e^{2i\phi_j}s_j^*}{2}\,\frac{s_j^*}{|s_j|},$$

respectively. The undesirable term
$\sum_{z_j \in B} e^{2i\phi_j}\frac{(s_j^*)^2}{|s_j|}$ is small compared with the
desirable term $\sum_{z_j \in B}\frac{s_j s_j^*}{|s_j|}$, provided the phase of $u_j$
is uncorrelated with the phase of $s_j$, a technical
condition that is usually met, and the number of chips
spreading a bit is at least 10, another condition that is
traditionally far exceeded by existing bandspread
communication signals. This same argument applies
to a narrowband signal whose structure is known
when the interference has phase that can be modeled
as random. This is one reason why detection
techniques developed for communications have
application to undersea surveillance.
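
The claim that the undesirable term averages out after despreading can be illustrated with a small simulation. The sketch below assumes a hypothetical binary chip sequence of 1000 chips and interferer phases drawn independently and uniformly; these choices, the fixed seed, and the function name are illustrative assumptions, not parameters from the report.

```python
import cmath
import math
import random

def despread_terms(chips, interferer_phases):
    """Return the desirable sum and the magnitude of the undesirable sum."""
    # Desirable term: sum of s_j s_j^* / |s_j| = sum of |s_j|.
    desirable = sum(abs(s) for s in chips)
    # Undesirable term: sum of exp(2 i phi_j) (s_j^*)^2 / |s_j|.
    undesirable = sum(cmath.exp(2j * phi) * s.conjugate() ** 2 / abs(s)
                      for s, phi in zip(chips, interferer_phases))
    return desirable, abs(undesirable)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_chips = 1000
chips = [complex(rng.choice((-1.0, 1.0)), 0.0) for _ in range(n_chips)]
phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_chips)]
desirable, undesirable = despread_terms(chips, phases)
```

The desirable term grows in proportion to the number of chips, while the undesirable term behaves like a random walk and grows only as its square root.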


For a signal of unknown structure, there is no question
of recovering the signal. However, it may still be of
considerable interest to detect its presence, as
would be the case in the frequency domain when the
phase of successive complex Fourier coefficients for a
given frequency cannot be estimated and it is
desirable to decide whether there is a narrowband
signal present at that frequency. Even for
the case of unknown signal, it is still of interest to
perform both amplitude and phase processing, if
possible, because amplitude processing provides
energy proportional to the projection of the signal onto
the received vector. For a signal vector rotating
around the interferer, this leads to a detector output
over time with periods indicating the presence of the
signal separated by periods in which it does not
appear. Successful processing of phase tends to fill
in the segments when the signal does not manifest
itself after amplitude processing.


From our discussions above, we see that the
processing of phase uses information about the
projection of the signal onto the interferer rotated
counterclockwise by 90° and, therefore, depends on
the phase of the signal relative to the interferer
component (or the dominant noise component when
an interferer is not present). Furthermore, for the phase
decomposition, signal phase and noise phase divided
by amplitude are approximated by the projection of the
signal vector onto the dominant interferer vector
rotated counterclockwise by 90°. For this reason, the
performance of phase processing is always somewhat
dependent on received signal baseband amplitude
statistics. Another consideration related to amplitude
and phase processing is that the processing is nearly
optimum when the interferer is much stronger than the
desired signal. However, in many cases, an
interferer-to-signal ratio of two is sufficient for the
processing to be more effective than traditional
processing. When applicable, it is always better to
perform both amplitude and phase processing.

Signal  Models. 


Different models for the signal lead to different
processing algorithms. We discuss the following:

(a)  signal  of  known  structure  and  unknown 
received  power,  and 

(b)  signal  of  unknown  received  power  and 
independent  amplitude  and  uniform  phase. 

Case  (a)  includes  processing  bandspread 
communication  signals  and  the  use  of  spectral 
analysis  to  detect  a  narrowband  signal  of  unknown 
frequency.  For  bandspread  signals,  when  each  bit  is 
spread  by  a  known  chip  sequence,  the  timing  of  the 
received  signal  and  the  internally  stored  chip 
sequence  is  aligned  through  the  process  of 
synchronization.  Synchronization  is  usually  done 
through  processing  of  known  bits.  Information  bits 
are  detected  by  correlation  over  the  bit  duration  of  the 
received  signal  with  the  stored  chip  sequence  for  that 
time  interval.  Thus  case  (a)  applies  to  those  adaptive 
locally  optimum  processing  techniques  used  to  detect 
bandspread  communication  signals.  For  detecting  a 
narrow  band  signal  by  using  spectral  analysis,  we  can 
view  the  Fourier  transform  as  the  simultaneous 
correlation  of  the  received  signal  with  a  family  of 
candidate  signals  of  known  structure.  In  this  sense, 
case  (a)  applies  to  locally  optimum  processing  of  time 
domain  signals  that  are  then  spectrally  analyzed  to 
detect  the  presence  of  narrowband  signals. 

Case  (b)  applies,  in  particular,  to  the  detection  of 
broadband  signals  through  energy  detection.  For  such 
signals, it is reasonable to suppose that the phase is
uniformly distributed on $(-\pi, \pi]$ and as a result the
mean values of the real and imaginary signal
components  are  zero.  Case  (b)  can  be  reduced  to 
case  (a)  for  the  detection  of  the  presence  of  a 
narrowband  signal  for  which  the  center  frequency  of  a 


Fourier  frequency  bin  is  nearly  the  same  as  the 
frequency  of  a  received  narrowband  signal  received  in 
a  single  mode.  In  this  case  the  signal  component  of 
the  complex  Fourier  coefficient  can  be  modeled  as  a 
fixed  unit  vector  of  unknown  phase  times  an  unknown 
constant. 

For  a  signal  of  known  structure,  we  have  seen  in  the 
previous  sections  how  the  optimum  detector  can  be 
implemented  given  a  probability  density  function  of  the 
interference.  We  have  also  discussed  the 
implementation  given  the  probability  density  functions 
of  interferer  amplitude  and  phase  under  the 
assumption  that  they  are  independent.  We  now  briefly 
discuss  the  unknown  signal  structure  case. 

Signal  of  Unknown  Structure  and  Uniform  Phase. 


In this case, $\frac{s_k}{|s_k|}$ is assumed unknown. To
implement the optimum detector, we replace
$a_k$ by $E(a_k) = 0$, $b_k$ by $E(b_k) = 0$, $a_k^2$ by $E(a_k^2)$,
$a_k b_k$ by $E(a_k b_k)$, and $b_k^2$ by $E(b_k^2)$. In addition, if the
amplitude and phase of the signal are assumed
independent, a natural assumption for signals with
uniform phase, the following hold, where $\psi_k$ denotes the phase of $s_k$:

$$E(a_k^2) = E(|s_k|^2\cos^2\psi_k) = \tfrac{1}{2}E(|s_k|^2),$$

$$E(a_k b_k) = E(|s_k|^2\cos\psi_k\sin\psi_k) = 0, \text{ and}$$

$$E(b_k^2) = E(|s_k|^2\sin^2\psi_k) = \tfrac{1}{2}E(|s_k|^2).$$

In any case, the only natural way to implement the
second-order detector is to treat the unknown nonzero
parameters as equal. With the above assumption,
the second-order detector, after multiplication by a constant c
inversely proportional to $E(|s|^2)$ (the deflection does not depend on c),
reduces to

$$\frac{1}{p(x,y)}\left[\frac{\partial^2 p(x,y)}{\partial x^2} + \frac{\partial^2 p(x,y)}{\partial y^2}\right],$$

which can be implemented given that an estimate of
the probability density function $p(x,y)$ is available.

If, in addition, the interferer is assumed to have
independent amplitude and phase and the phase is
uniform, the second-order detector becomes

$$\frac{\tilde{p}''(A)}{\tilde{p}(A)} + \frac{\tilde{p}'(A)}{A\,\tilde{p}(A)} = \frac{p''(A)}{p(A)} - \frac{1}{A}\,\frac{p'(A)}{p(A)} + \frac{1}{A^2}.$$
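
The uniform-phase second-order detector above depends only on the amplitude density, so it can be evaluated from any smooth estimate of $p(A)$. The sketch below uses central finite differences for the derivatives and, purely as an assumed test case, a Rayleigh amplitude density (the amplitude law of circular Gaussian noise), for which the expression simplifies to $A^2/\sigma^4 - 2/\sigma^2$. The function names and the step size are illustrative choices.

```python
import math

def second_order_uniform_phase(p, amplitude, h=1e-4):
    """Evaluate p''/p - p'/(A p) + 1/A^2 with central differences."""
    p0 = p(amplitude)
    p_plus = p(amplitude + h)
    p_minus = p(amplitude - h)
    d1 = (p_plus - p_minus) / (2.0 * h)            # approximates p'(A)
    d2 = (p_plus - 2.0 * p0 + p_minus) / (h * h)   # approximates p''(A)
    return d2 / p0 - d1 / (amplitude * p0) + 1.0 / amplitude ** 2

def rayleigh(sigma):
    """Rayleigh amplitude density with scale sigma (assumed noise model)."""
    return lambda a: (a / sigma ** 2) * math.exp(-a * a / (2.0 * sigma ** 2))
```

With $\sigma = 1$ and $A = 1.5$ the closed form gives $1.5^2 - 2 = 0.25$, which the finite-difference evaluation reproduces to within the discretization error.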




Narrowband  Nonzero-Mean  Frequency  Domain 
Signal. 


If  adaptive  locally  optimum  detection  is  to  be  applied  in 
the  frequency  domain,  and  the  desired  signal  is 
narrowband  and  stable,  this  information  can  be  used 
to  reduce  the  general  detector  for  frequency  domain 
signals  to  two  cases  that  correspond  closely  to  the  two 
cases  already  discussed  for  time  domain  signals.  One 
case  occurs  when  the  center  frequency  of  the  Fourier 
transform  frequency  bin  containing  the  signal  and  the 
frequency of the signal are very nearly the same, so
that the Fourier coefficient of the signal is nearly
stationary. In this case, the signal can be modeled by
a  unit  vector  u  times  the  expected  value  of  the  signal 
magnitude.  The  second  case  occurs  when  the 
complex  Fourier  coefficient  of  the  signal  rotates 
around  the  origin  so  that  during  the  detection  interval 
the  expected  values  of  the  real  and  imaginary 
components  of  the  signal  can  be  taken  as  zero.  We 
treat  the  nonzero-mean  case  in  this  subsection  and 
the  zero-mean  case  in  the  next  section. 


Let the signal vector be approximated by $(u, v)\,|s|$
with $(u, v)$ an unknown, but fixed, unit vector. Let
$\{(u_m, v_m) \mid 1 \le m \le M\}$ be a set of unit vectors
corresponding to equally spaced angles between
0 and $\pi$. After the processing, the magnitude of the
result is the quantity examined to determine the
presence of the narrowband signal so that $-(u, v)$ and
$(u, v)$ would provide the same performance. Then,
this case is reduced to the signal of known structure
case as follows. Let

$$D(x_1, y_1, \ldots, x_N, y_N) = \max_m \left|\frac{1}{N}\sum_{k=1}^{N} D_m(x_k, y_k)\right|,$$

where

$$\frac{1}{N}\sum_{k=1}^{N} D_m(x_k, y_k) = -\frac{1}{N}\left[\sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial p(x_k,y_k)}{\partial x}\,u_m + \sum_{k=1}^{N}\frac{1}{p(x_k,y_k)}\frac{\partial p(x_k,y_k)}{\partial y}\,v_m\right]$$

is a line detector. A minimal set of unit vectors to
implement the above detector would be

$$\{(u_m, v_m) \mid 1 \le m \le 4\} = \left\{(1,0),\ (0,1),\ \tfrac{1}{\sqrt{2}}(1,1),\ \tfrac{1}{\sqrt{2}}(1,-1)\right\}.$$

The argument presented for a signal of known
structure can then be applied with $s_k = (u_m, v_m)$ and
with m chosen as the value leading to the maximum
value of the line detector $\frac{1}{N}\sum_{k=1}^{N} D_m(x_k, y_k)$. In
particular, for independent amplitude and uniform
phase, the second partial derivatives term of the
detector has expected value zero up to fourth
order in the signal strength, while the detector given
here has nonzero expected value with second-order
terms in signal strength.
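
The maximization over candidate unit vectors described above can be sketched as follows. The four directions are those listed in the text; the circular Gaussian score function used as the default, and all function names, are assumptions introduced only so the example is self-contained. In practice the score would come from an estimate of the interference density.

```python
import math

# Candidate unit vectors (u_m, v_m) for m = 1, ..., 4.
UNIT_VECTORS = [
    (1.0, 0.0),
    (0.0, 1.0),
    (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)),
    (1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)),
]

def gaussian_score(x, y, sigma=1.0):
    """(1/p) dp/dx and (1/p) dp/dy for circular Gaussian noise (assumed model)."""
    return -x / sigma ** 2, -y / sigma ** 2

def line_detector(samples, u, v, score=gaussian_score):
    """Average of the first-order detector over the samples for direction (u, v)."""
    total = 0.0
    for x, y in samples:
        sx, sy = score(x, y)
        total += -(sx * u + sy * v)
    return total / len(samples)

def max_direction_detector(samples, score=gaussian_score):
    """Largest line-detector magnitude over the candidate directions."""
    return max(abs(line_detector(samples, u, v, score)) for u, v in UNIT_VECTORS)
```

Because the magnitude is taken, samples concentrated at $(1, 0)$ and at $(-1, 0)$ give the same output, which is why angles between 0 and $\pi$ suffice.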

Narrowband  Zero-mean  Frequency-domain  Signal. 


There  are  two  ways  that  this  case  can  occur;  one  way 
is  that  the  signal  vector  is  rotating  around  the  origin  at 
a  fixed  rate.  This  case  is  not  of  much  interest  to  us, 
for  presumably  the  center  frequency  of  the  Fourier  bin 
containing  the  signal  will  have  frequency  quite  close  to 
that  of  the  signal,  and  as  long  as  the  integration  time  is 
not  too  long,  the  above  case  will  occur.  The  other 
way  is  that  the  signal  vector  can  be  modeled  as  having 
random  phase.  In  either  case,  the  signal  can  be 
treated  as  unknown  as  in  the  second  case  considered. 
In  addition,  the  linear  partial  derivative  terms  have 
zero  expected  value  when  a  signal  is  present. 

Summary  on  Signal  Modeling. 


The signal model can be taken as either of known
structure or of unknown structure. Furthermore, for
processing independent amplitudes and uniform
phases, the two cases are complementary. For
signals of known structure, the terms given by the first
partial derivatives dominate the detector, while
for the unknown signal case, the terms given by the
second partial derivatives dominate.

For processing frequency domain beamformed data,
the natural approach is to perform processing under
both assumptions and add the results. In this
manner, the combined processing should improve
detection of the signal regardless of which assumption
best describes the received signal (nonzero-mean real
or nonzero-mean imaginary part, or zero-mean real and
zero-mean imaginary parts), with the improvement
depending to some extent on the relative magnitudes of
the first and second partial derivatives of the interference density.
Interference Models.


In  the  following  section,  we  discuss  the  adaptive 
locally  optimum  algorithms  that  arise  for  different 
interference  models  for  the  cases  of  signals  of  either 
known  structure  or  unknown  structure.  We  first 
discuss  the  case  of  Gaussian  noise  to  show  that  the 
adaptive  locally  optimum  processing  reduces  to 
traditional  processing.  We  then  show  that  appropriate 
preprocessing  can  sometimes  transform  a  baseband 
sample  to  a  quantity,  still  containing  a  recoverable 
signal  term,  which  is  Gaussian  even  though  the 
original  sample  contains  structured  interference.  This 
approach  leads  to  important  algorithms  for  undersea 
surveillance,  in  particular  to  algorithms  that  can  be 
used  to  detect  weak  narrowband  signals  masked  by 
stronger  narrowband  signals  of  nearly  the  same 
frequency.  The  algorithms  based  on  preprocessing 
can  be  viewed  as  special  cases  of  a  more  general 
class  of  adaptive  locally  optimum  processing 
algorithms  that  are  derived  from  modeling  the 
interference  plus  noise  by  multistate  models.  The 
preprocessing  is  used  to  assign  the  samples  to  an 
interference  model  with  a  single  state,  while  for  the 
more  general  multistate  models,  the  samples  are 
assigned  probabilistically  to  the  states. 

Gaussian  Noise. 


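The simplest interference model for a real variable x is
that the probability density function of x is

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)}$$

with $\sigma^2$ the known variance of
the Gaussian noise. For this case,

$$\frac{1}{p(x)}\frac{dp(x)}{dx} = -\frac{x}{\sigma^2} \quad\text{and}\quad \frac{1}{p(x)}\frac{d^2p(x)}{dx^2} = \frac{x^2}{\sigma^4} - \frac{1}{\sigma^2},$$

and as a result, the first-order detector reduces to a
linear transformation and the second-order detector to
its square up to the constant $\frac{1}{\sigma^2}$.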


When  the  successive  samples  contain  an  interferer 
term  that  is  correlated,  it  is  sometimes  possible  to 
remove  this  correlation  by  processing  appropriate 
linear  combinations  of  the  original  samples  with  the 
result  that  the  only  interference  remaining  is  Gaussian, 
that  is  the  remaining  interference  can  be  modeled  by  a 
one-state  Gaussian  mixture  model.  We  present  two 
examples,  an  amplitude  case  and  a  phase  case,  for 


which  the  results  are  surprising,  have  important 
applications  to  undersea  surveillance,  and  are  new. 

One-state  Amplitude  Model. 


Suppose that a small signal is received in the
presence of a large constant amplitude interferer and
modest levels of Gaussian noise. Suppose further
that successive sample signal components are
uncorrelated and independent of the sample interferer
components. Consider the amplitude preprocessing
step

$$A_j \to A_j - \frac{1}{2N}\sum_{k=-N,\,k\ne 0}^{N} A_{j+k}.$$

The signal term in the preprocessor output is

$$\mathrm{proj}_{z_j}(s_j) - \frac{1}{2N}\sum_{k=-N,\,k\ne 0}^{N}\mathrm{proj}_{z_{j+k}}(s_{j+k}).$$

The desired signal term in this expression depends in
general on N, the sample rate, and the relative phase
of the signal and interference. The term
$\frac{1}{2N}\sum_{k=-N,\,k\ne 0}^{N}\mathrm{proj}_{z_{j+k}}(s_{j+k})$ can be viewed as a signal
distortion term. This term can be shown to be small
under quite general conditions. We outline the
argument for two cases: (1) random phase angles
between the signal and interference and (2) linear
phase difference between the signal and interference.
The first case usually applies when either the signal or
interference is a broadband signal, and the second
case applies when both the signal and interference are
narrowband signals.

The signal distortion term can be viewed as the
distance from the origin of a one-dimensional random
walk when the relative phase of the signal and
interference are random. The individual steps are the
projections $\mathrm{proj}_{z_{j+k}}(s_{j+k})$. The expected distance
from the origin is roughly proportional to the square
root of the number of steps 2N times an average step
size. For the amplitude case, this distance, after division by 2N, is

$$\frac{\sqrt{2N}}{2N}\cdot\frac{\mathrm{avg}(|s|)}{2} = \frac{\mathrm{avg}(|s|)}{2\sqrt{2N}}$$

because

$$\mathrm{avg}\left(|\mathrm{proj}_{z_{j+k}}(s_{j+k})|\right) = \tfrac{1}{2}\,\mathrm{avg}(|s|).$$

This means that the distortion energy is expected to
be $\frac{1}{2N}$ times the signal energy. Thus, a desire to
neglect the signal distortion term imposes a mild
restraint on N. Consider the case when all the weights
are about the same size; then M = 2N. Here, N = 4
would lead to about 12.5% distortion, so that N = 4 is
a practical lower limit on the value of N. More
generally, for adaptive weight cases, N = 8 is a
practical lower limit on the value of N.
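
The random-walk estimate of the distortion energy can be checked with a short Monte Carlo experiment. The sketch assumes a unit-magnitude signal, independent uniformly distributed relative phases, and equal weights $1/(2N)$ on the 2N neighbouring samples; the function name, trial count, and seed are arbitrary illustrative choices.

```python
import math
import random

def distortion_to_signal_energy(n_half, trials=20000, seed=1):
    """Estimate the ratio of distortion energy to desired-signal energy."""
    rng = random.Random(seed)
    distortion_energy = 0.0
    signal_energy = 0.0
    for _ in range(trials):
        # Desired term: projection of a unit signal at a random relative phase.
        signal_energy += math.cos(rng.uniform(0.0, 2.0 * math.pi)) ** 2
        # Distortion term: average of the 2N neighbouring projections.
        walk = sum(math.cos(rng.uniform(0.0, 2.0 * math.pi))
                   for _ in range(2 * n_half))
        distortion_energy += (walk / (2 * n_half)) ** 2
    return distortion_energy / signal_energy
```

Under these assumptions the estimated ratio should be close to $1/(2N)$, that is, about 0.125 for N = 4.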

For the case of constant-frequency signal and interference, the magnitude of the signal distortion term depends on the frequency difference between signal and interferer and on the sample rate. We exploit the fact that the signal vector rotates around the interferer vector at a fixed frequency. The geometry governing this situation is shown in figure 1. The analysis makes use of the coordinate system defined relative to the interferer vector. Observe that

$$\mathrm{proj}_{z_{j+k}}(s_{j+k}) \approx \mathrm{proj}_{u_{j+k}}(s_{j+k}).$$

Define $\hat{u}_{j+k} = R_{-k f_u}(u_{j+k})$ and $\hat{s}_{j+k} = R_{-k f_u}(s_{j+k})$, where $f_u$ is the ratio of the frequency of the interferer to the sampling frequency and $R_{\psi}(u)$ is a rotation of $u$ through $\psi$ radians. Next, observe that $\hat{u}_{j+k} = u_j$, so that the projections of the vectors $\hat{s}_{j+k}$ onto $u_j$ are identical to those of $s_{j+k}$ onto $u_{j+k}$. Let $\psi$ denote the angle between $s_j$ and $u_j$. Then

$$\mathrm{proj}_{z_j}(s_j) - \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}\mathrm{proj}_{z_{j+k}}(s_{j+k}) = |s|\cos\psi - \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}\mathrm{proj}_{u_j}(\hat{s}_{j+k}).$$

Note that

$$\mathrm{proj}_{u_j}(\hat{s}_{j+k}) = |s|\cos\bigl(k(f_s - f_u) + \psi\bigr),$$

as seen from figure 2, where $f_s$ is the ratio of the signal frequency to the sampling frequency. It follows that

$$\sum_{k=-N,\,k\neq 0}^{N}\mathrm{proj}_{u_j}(\hat{s}_{j+k}) = \sum_{k=-N,\,k\neq 0}^{N}|s|\cos\bigl(k(f_s - f_u) + \psi\bigr).$$

This last sum can be viewed as an approximation to an integral after adding back in the k = 0 term. Let $\Delta f = f_s - f_u$ and suppose that $(2N+1)\Delta f < 1$. Then

$$\cos\psi + \sum_{k=-N,\,k\neq 0}^{N}\cos\bigl(k(f_s - f_u) + \psi\bigr) \approx \int_{-N}^{N}\cos(t\,\Delta f + \psi)\,dt$$

$$= \frac{1}{\Delta f}\bigl[\sin(N\Delta f + \psi) - \sin(-N\Delta f + \psi)\bigr]$$

$$= \frac{2}{\Delta f}(\sin N\Delta f)\cos\psi.$$

Thus

$$\mathrm{proj}_{z_j}(s_j) - \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}\mathrm{proj}_{z_{j+k}}(s_{j+k})$$

$$= |s|\cos\psi + \frac{1}{2N}|s|\cos\psi - \frac{1}{N\Delta f}(\sin N\Delta f)\,|s|\cos\psi$$

$$= \mathrm{proj}_{z_j}(s_j)\left(1 + \frac{1}{2N} - \frac{1}{N\Delta f}\sin N\Delta f\right).$$

It follows that N should be taken large enough so that the term involving $\sin N\Delta f$ can be neglected. In general, N should be chosen at least as large as the sampling frequency divided by the difference of the signal and interferer frequencies, i.e., as the frequency resolution of the spectral analysis.
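The quality of the integral approximation used in the narrowband argument can be checked numerically. The following is a minimal sketch, assuming the frequencies are expressed in radians per sample (as the cosine arguments in the text suggest); the values of `N`, `DELTA_F`, and `PSI` are illustrative choices, not values from the report.

```python
import math

def offset_sum(n_half, delta_f, psi):
    """Sum of cos(k*delta_f + psi) over k = -N..N, with the k = 0 term included."""
    return sum(math.cos(k * delta_f + psi) for k in range(-n_half, n_half + 1))

def integral_approx(n_half, delta_f, psi):
    """Closed form of the integral of cos(t*delta_f + psi) over [-N, N]."""
    return 2.0 / delta_f * math.sin(n_half * delta_f) * math.cos(psi)

# Illustrative values (assumptions); they satisfy (2N + 1) * delta_f < 1.
N, DELTA_F, PSI = 8, 0.05, 0.3
exact = offset_sum(N, DELTA_F, PSI)
approx = integral_approx(N, DELTA_F, PSI)
relative_error = abs(exact - approx) / abs(exact)
print(exact, approx, relative_error)
```

For these values the sum and the integral agree to within a few percent; the residual comes mainly from the sum having 2N + 1 terms while the integral covers an interval of length 2N.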

For circularly distributed Gaussian noise, the variance of the Gaussian noise in the output of the processing algorithm under discussion is

$$\left(1 + 2N\left(\frac{1}{2N}\right)^{2}\right)\mathrm{var}(n) = \left(1 + \frac{1}{2N}\right)\mathrm{var}(n),$$

with $\mathrm{var}(n)$ equal to the variance of the real and imaginary components of the Gaussian noise component of the received baseband samples.
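The variance factor above follows from summing the squared filter weights. A small sketch, assuming independent, identically distributed noise samples and an output of the form $n_j - \sum_k w_k n_{j+k}$ (the function name and the alternative weight set are hypothetical), shows the uniform-weight value and compares it with one unequal set of weights that also sums to one.

```python
def noise_variance_gain(weights):
    """Output/input noise variance ratio for y_j = n_j - sum_k w_k * n_{j+k}, iid noise."""
    return 1.0 + sum(w * w for w in weights)

N = 8
# Uniform weights 1/(2N) on the 2N neighbouring samples.
uniform = [1.0 / (2 * N)] * (2 * N)
gain_uniform = noise_variance_gain(uniform)

# A hypothetical unequal weighting that still sums to one: only half the neighbours are used.
skewed = [1.0 / N] * N + [0.0] * N
gain_skewed = noise_variance_gain(skewed)

print(gain_uniform, gain_skewed)
```

With N = 8 the uniform weights give 1 + 1/16, while the unequal weights give a larger factor, as expected from the Cauchy-Schwarz inequality for weights constrained to sum to one.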

One-state  Phase  Model. 

Suppose that the interference has linear phase (that is, it has constant frequency) and the signal and Gaussian noise are uncorrelated. Consider the preprocessing step to remove the interferer phase contribution,

$$\Delta_k\theta_j = \theta_j - \frac{1}{2}\left(\theta_{j-k} + \theta_{j+k}\right),$$

followed by

$$\frac{1}{N}\sum_{k=1}^{N}\Delta_k\theta_j = \theta_j - \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}\theta_{j+k}.$$

The signal term in this equation involves projections onto $iz$ divided by $A$. For phase processing, the signal and Gaussian noise after this processing depend on the values of the ratios $A_j/A_{j+k}$ as well as on N.

The signal term in the output of the algorithm is approximately

$$\frac{\mathrm{proj}_{iz_j}(s_j)}{A_j} - \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}\frac{\mathrm{proj}_{iz_{j+k}}(s_{j+k})}{A_{j+k}}.$$


The signal distortion term is expected to be small when the signal-to-interferer relative phase is random. The random walk argument used for the amplitude case still applies provided that the value of $A_j/A_{j+k}$ is uncorrelated with the relative phase $\psi_j$. The distortion estimate using the random walk is as before except that the average step is multiplied by the average value of the absolute values of the ratios $A_j/A_{j+k}$.


The  linear  relative  phase  case,  that  is  the  case  for 
narrowband  signal  and  interference,  poses  some 
difficulties  when  the  amplitude  of  the  interferer  varies 
from  sample  to  sample.  However,  the  case  when  the 
interferer  amplitude  is  a  constant  follows  as  before  by 
replacing  u  by  iu  in  the  argument.  In  particular,  we 
obtain  the  approximation 

$$\mathrm{proj}_{iz_j}(s_j) - \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}\mathrm{proj}_{iz_{j+k}}(s_{j+k}).$$

The Gaussian noise variance of the one-state phase algorithm for sample j is

$$\left[1 + \left(\frac{1}{2N}\right)^{2}\sum_{k=-N,\,k\neq 0}^{N}\left(\frac{A_j}{A_{j+k}}\right)^{2}\right]\mathrm{var}(n).$$

This  reduces  to  the  estimate  of  the  variance  of  the 
Gaussian  noise  for  the  one-state  amplitude  algorithm 
in  a  constant  amplitude  interferer. 


This example shows that even though the original samples had uniform phase, and processing them under the assumption of independent samples led to no processing gain, a preprocessing step to remove the correlation of the interference phases followed by adaptive locally optimum processing (trivial here, but not in the multistate models treated next) can lead to considerable processing gains.

One-state  Model  Algorithm  Performance. 


For  a  constant  amplitude,  constant  frequency 
interferer  with  the  signal  and  noise  satisfying  the 
assumptions  above,  the  signal  can  be  reconstructed 
by  forming: 


Z  Aj,k\ 

k=^N.ktO 


+16,-^  i 

If  the  signal  is  of  known  structure,  this  preprocessing 
step  is  followed  by  taking  the  real  part  of  this  quantity 

after  muttiplication  by  — r.  If  the  signal  is  of 
Uy  I 

unknown  structure,  this  preprocessing  step  is  followed 
by  taking  the  norm  of  the  quantity. 

For  a  known  signal,  the  amplitude  and  phase  terms 
provide  the  projections  of  the  signal  onto  z  and  iz. 

The signal is reconstructed with distortion that depends on N and is small if N is large. This is important for undersea surveillance applications because it means that clues used to classify signals are preserved by the processing.

The variance of the Gaussian noise increases from $2\sigma^2$ to $2\sigma^2\left(1 + \frac{1}{2N}\right)$ as a result of the processing.

This  is  the  minimal  variance  for  any  filter  of  the 
complex  samples  using  2N  weights  whose  sum  is  1 . 
Therefore,  the  one-state  processing  represents  a 
nonadaptive  implementation  of  a  Wiener  filter.  See 
Bond  (1992)  for  a  more  general  treatment  of  the 
relationship  between  adaptive  locally  optimum 
processing  and  adaptive  Wiener  filtering. 

If only the one-state amplitude algorithm or the one-state phase algorithm is effective, the insertion loss (the difference in dB between the output signal-to-Gaussian noise and the input signal-to-Gaussian noise) for the algorithm used is about 3 dB, because the expected signal term energy is reduced by a factor of four while the Gaussian noise variance is reduced by a factor of two.

The expected output signal-to-noise ratio can be calculated for the second-order one-state detector $D_2(A) = (A - \mu)^2$ by using the expansion of the baseband samples into the amplitude of the interferer plus the projections of the signal and Gaussian noise onto the interferer:

$$E_{n+s}(D_2(A)) = E_{n+s}\bigl[(\mathrm{proj}_u(s) + \mathrm{proj}_u(n))^2\bigr]$$

$$= E_{n+s}\bigl[\mathrm{proj}_u^2(s)\bigr] + E_{n+s}\bigl[\mathrm{proj}_u^2(n)\bigr]$$

under the assumption that $\mathrm{proj}_u(s) = |s|\cos\phi$ and $\mathrm{proj}_u(n) = |n|\cos\psi$ with $\phi$ and $\psi$ independent random variables whose values are uncorrelated with $|s|$ and $|n|$. Also, $E_n(D_2(A)) = E_n\bigl[\mathrm{proj}_u^2(n)\bigr]$, so that

$$E_{n+s}(D_2(A)) - E_n(D_2(A)) = E_{n+s}\bigl[\mathrm{proj}_u^2(s)\bigr] = \tfrac{1}{2}|s|^2.$$


Note that

$$\sigma_n^2(D_2(A)) = E\bigl[\mathrm{proj}_u^4(n)\bigr] = E\bigl[|n|^4\bigr]\,E\bigl[\cos^4\psi\bigr] = 3\sigma^4,$$

using $E[|n|^4] = E[(n_x^2 + n_y^2)^2] = E[n_x^4 + 2n_x^2 n_y^2 + n_y^4] = 8\sigma^4$, where $n_x$ and $n_y$ are the real and imaginary components of the Gaussian noise n, and assuming the relative phase $\psi$ is uniformly distributed. It follows that the signal-to-noise ratio after the processing is approximately $\frac{1}{2\sqrt{3}}\frac{|s|^2}{\sigma^2}$, where $\sigma^2$ is the variance of the real and imaginary components of the Gaussian noise component of the received signal.
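The two noise moments used above can be checked by simulation. This is a minimal Monte Carlo sketch, assuming unit-variance components and a fixed interferer direction along the real axis (so that the projection of the noise onto the interferer is simply its real part); the seed and trial count are arbitrary choices.

```python
import random

random.seed(0)          # arbitrary seed, for repeatability only
SIGMA = 1.0             # standard deviation of each noise component (assumption)
TRIALS = 200_000

sum_norm4 = 0.0         # accumulates |n|^4
sum_proj4 = 0.0         # accumulates proj_u(n)^4 with u along the real axis
for _ in range(TRIALS):
    nx = random.gauss(0.0, SIGMA)
    ny = random.gauss(0.0, SIGMA)
    sum_norm4 += (nx * nx + ny * ny) ** 2
    sum_proj4 += nx ** 4

mean_norm4 = sum_norm4 / TRIALS   # expected to be near 8 * sigma^4
mean_proj4 = sum_proj4 / TRIALS   # expected to be near 3 * sigma^4
print(mean_norm4, mean_proj4)
```

The sample averages should fall close to $8\sigma^4$ and $3\sigma^4$ respectively, which is consistent with $E[\cos^4\psi] = 3/8$ for a uniformly distributed angle.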


Consider estimating the expected preprocessor output signal-to-noise ratio for the one-state symmetric phase-difference model in a similar way to that for the amplitude one-state model. For the second-order detector,

$$D_2(\Delta\theta_j)A_j^2 = (\Delta\theta_j - \mu)^2 A_j^2,$$

under the assumption that $\phi$ and $\psi$ are independent random variables whose values are uncorrelated with $|s|$ and $|n|$,

$$E_{n+s}\bigl(D_2(\Delta\theta_j)A_j^2\bigr) - E_n\bigl(D_2(\Delta\theta_j)A_j^2\bigr) = E_{n+s}\bigl[(\Delta(\mathrm{proj}_{iu}(s)))^2\bigr],$$

where

$$\Delta(\mathrm{proj}_{iu}(s)) = |s_j|\cos\phi_j - \frac{1}{2}\left(\frac{A_j}{A_{j-1}}|s_{j-1}|\cos\phi_{j-1} + \frac{A_j}{A_{j+1}}|s_{j+1}|\cos\phi_{j+1}\right).$$


In general, the expectation on the right-hand side of the equation depends on signal levels and baseband sample amplitude correlation properties as well as the relative phase correlation properties for the three successive samples j-1, j, and j+1. In practice, the fact that the signal component can be small in this expression limits the use of the symmetric phase difference algorithm.


Assume that $A_j/A_{j-1} = 1$ and $A_j/A_{j+1} = 1$, $|s_{j-1}| = |s_j|$, $|s_{j+1}| = |s_j|$, and the relative phases are independent random variables and uniformly distributed. Then

$$E\bigl[(\Delta(\mathrm{proj}_{iu}(s)))^2\bigr] = \frac{|s_j|^2}{(2\pi)^3}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}\left(\cos\phi - \frac{\cos\psi + \cos\varphi}{2}\right)^{2} d\phi\,d\psi\,d\varphi.$$
Assume that $A_j/A_{j-1} = 1$ and $A_j/A_{j+1} = 1$. Treating the three Gaussian noise terms in the detector as independent, identically distributed, and independent of coordinate system, it follows that

$$\sigma^2(D_2(\Delta\theta)) = E\bigl[(n_1 - \tfrac{1}{2}(n_2 + n_3))^4\bigr]$$

with $n_1$, $n_2$, and $n_3$ zero-mean random variables with variance $\sigma^2$. Note that

$$E\bigl[(n_1 - \tfrac{1}{2}(n_2 + n_3))^4\bigr] = E(n_1^4) + \tfrac{3}{2}E\bigl[n_1^2(n_2^2 + n_3^2)\bigr] + \tfrac{1}{16}E\bigl[n_2^4 + 6n_2^2 n_3^2 + n_3^4\bigr]$$

$$= \bigl[3 + \tfrac{3}{2} + \tfrac{3}{2} + \tfrac{1}{16}(3 + 6 + 3)\bigr]\sigma^4 = \tfrac{27}{4}\sigma^4.$$

It follows that for this special case the expected signal-to-noise ratio for the symmetric phase-difference one-state second-order detector is approximately $\frac{2}{3\sqrt{3}}\frac{|s|^2}{\sigma^2}$. For most cases, this is probably the best that can be expected from the phase processing.

The  insertion  loss  for  the  second-order  amplitude 
algorithm  alone  is  5.4  dB  and  for  the  second-order 
symmetric  phase-difference  algorithm  is  4.1  dB. 
Combining  the  results  should  lead  to  improved  line 
detection  over  either  alone  by  producing  a  better 
defined  line,  but  it  is  not  clear  how  to  calculate  the 
signal-to-noise  ratio  associated  with  this  noncoherent 
combining  process. 
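The two insertion-loss figures can be reproduced from the signal-to-noise coefficients, as in the following sketch. It assumes the loss is measured relative to a reference ratio of $|s|^2/\sigma^2$ and that the coefficients are $1/(2\sqrt{3})$ for the amplitude detector and $2/(3\sqrt{3})$ for the symmetric phase-difference detector; the helper name is hypothetical. The fourth moment $27/4$ is also recomputed from the Gaussian identity $E[X^4] = 3\,\mathrm{var}(X)^2$.

```python
import math

def insertion_loss_db(snr_coefficient):
    """Loss in dB of an output SNR equal to snr_coefficient times the reference SNR."""
    return -10.0 * math.log10(snr_coefficient)

amplitude_loss = insertion_loss_db(1.0 / (2.0 * math.sqrt(3.0)))
phase_loss = insertion_loss_db(2.0 / (3.0 * math.sqrt(3.0)))

# n1 - (n2 + n3)/2 is Gaussian with variance (1 + 1/4 + 1/4) * sigma^2.
combined_variance = 1.0 + 0.25 + 0.25
fourth_moment = 3.0 * combined_variance ** 2   # in units of sigma^4

print(round(amplitude_loss, 2), round(phase_loss, 2), fourth_moment)
```

Under these assumptions the values come out near 5.4 dB and 4.1 dB, with a fourth moment of 6.75 (that is, 27/4) in units of $\sigma^4$.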

Multistate  Models  for  Interference  Amplitudes  and 
Phases. 


Gaussian mixture models have been proposed for modeling ship-generated acoustic noise. Gaussian kernel representations of probability density functions have been used to model jamming plus background noise for communications applications. Recall that the Gaussian mixture model arises from modeling the interference plus background noise of the real and imaginary components of the baseband samples as independent, identical zero-mean Gaussian distributions. Interference states are distinguished by the variance of the Gaussian interference plus noise. The probability density function for a complex sample is obtained by treating the in-phase and quadrature samples as independent. We introduce another mixture model related to kernel representations of probability density functions. This model involves distributions with nonzero means, and we will call it a noncentral mixture model.

Different noncentral mixture models are used to model amplitude and phase or symmetric phase differences.
In  this  context,  the  sample  amplitudes  and  phases  are 
treated  as  independent  and  each  is  then  modeled  by  a 
noncentral  mixture  model.  For  these  models,  the 
interference-plus-noise  samples  are  assumed  to 
contain  deterministic  and  random  components.  The 
states  of  the  model  are  determined  by  the 
deterministic  interference  with  the  random  component 
treated  as  stationary  for  modeling.  For  a  noncentral 
mixture  model  of  sample  amplitudes,  the  deterministic 
interferer  is  assumed  to  take  on  discrete  values  of 
amplitude  that  define  the  states  of  the  model.  The 
probability  density  function  of  the  amplitudes  of  the 
samples  in  a  state  is  a  noncentral  truncated  Gaussian 
distribution  with  variance  after  truncation  equal  to  the 
variance  of  the  background  noise. 

Figure 3 presents scatter plots for Gaussian noise and for Gaussian noise plus a CW interferer. The Gaussian noise case represents the scatter of complex samples for a typical Gaussian mixture model state, while the Gaussian noise-plus-CW case represents the scatter plot of complex samples for a typical noncentral mixture model state. Observe that the scatter plots presented in the companion report to this report, "Gaussian Mixture Models for Undersea Surveillance," for the MDA hydrophone data are closer in appearance to that of Gaussian noise than to that of Gaussian noise plus CW interferer. Nevertheless, it seems reasonable to suppose that under some conditions acoustic interference might be better modeled by a noncentral model than by a Gaussian mixture model, and this is one reason the theory of noncentral mixture models is discussed. For communication applications, interference is often better modeled by noncentral mixture models than by central mixture models. In any case, the discussion of noncentral mixture models together with Gaussian mixture models provides additional insight into the theory of adaptive locally optimum processing for ocean basin surveillance and for communications.

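The states of a Gaussian mixture model have probability density

$$p_k(x,y) = p(x)\,p(y) = \left(\frac{1}{\sqrt{2\pi}\,\sigma_k}e^{-x^2/2\sigma_k^2}\right)\left(\frac{1}{\sqrt{2\pi}\,\sigma_k}e^{-y^2/2\sigma_k^2}\right) = \frac{1}{2\pi\sigma_k^2}e^{-A^2/2\sigma_k^2}.$$

The corresponding density functions for $A$ and $\theta$ are

$$p_k(A) = \frac{A}{\sigma_k^2}e^{-A^2/2\sigma_k^2} \quad\text{and}\quad p(\theta) = \frac{1}{2\pi}\ \text{for}\ -\pi < \theta < \pi,$$

which satisfy $p_k(x,y) = \dfrac{p_k(A)\,p(\theta)}{A}$.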

A 

The sample amplitudes in the k-th state of a noncentral mixture model have probability density function

$$p_k(A) = \frac{1}{\sqrt{2\pi}\,\sigma_n}e^{-(A-\mu_k)^2/2\sigma_n^2},\quad A \geq 0,$$

where $\mu_k$ is the mean value of the amplitudes of the samples in the k-th state. The fact that $A \geq 0$ imposes a constraint on the Gaussian component of the interference in the k-th state, namely that $\sigma_n \ll \mu_k$. This means that the noncentral mixture model is only applicable when the small signal hypothesis applies for all the discrete values of the interference.

The probability density function for the amplitudes of the samples in an S-state mixture model, Gaussian or noncentral, has the form

$$p(A) = \sum_{k=1}^{S}p_k(A)\,p_k,$$

where $p_k(A)$ is the probability density function of the amplitudes of the samples in the k-th state and $p_k$ is the probability that a sample is in the k-th state.


=  (■ 


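For a noncentral mixture model, the first-order detector for sample j is

$$\left(-\frac{p'(A_j)}{p(A_j)} + \frac{1}{A_j}\right)\frac{\mathrm{Re}[z_j s_j^*]}{|z_j||s_j|} = \left(\frac{1}{\sigma_n^2}\sum_{k=1}^{S}(A_j - \mu_k)\,w(j,k) + \frac{1}{A_j}\right)\frac{\mathrm{Re}[z_j s_j^*]}{|z_j||s_j|}$$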


and the second-order detector for sample j is

$$\frac{p''(A_j)}{p(A_j)} - \frac{1}{A_j}\left(\frac{p'(A_j)}{p(A_j)} - \frac{1}{A_j}\right)$$


uniformly  distributed.  This  is  equivalent  to  modeling 
the  interference  as  having  linear  phase  at  some  times 
and random phase at other times. Another option is to
model  symmetric  phase  differences  by  a  noncentral 
mixture  model.  In  this  case,  the  deterministic  interferer 
is  assumed  to  take  on  discrete  values  of  phase 
difference.  These  values  determine  the  states  of  the 
model  and  as  a  result  the  model  would  have  similar 
structure  to  that  of  a  noncentral  mixture  model  for 
amplitudes. 

A  noncentral  mixture  model  for  the  probability  density 
function  of  the  sample  symmetric  phase  differences  in 
the  k-th  state  has  probability  density  function 


For  a  Gaussian  mixture  model,  the  first-order  detector 
is 


p'jAj)  Re[zjs]] 
Pi^j)  IZyll^yl 


Re[ZjSj] 

u;r 


and  the  second-order  detector  is 


P"iAj)  J.  A] 

PiAj)  at 


when  the  first-order  detector  vanishes.  In  either  case, 
the  weights  are 


^ij,k)=pk 


$$p_k(\Delta\theta) = \frac{1}{\sqrt{2\pi}\,\sigma}e^{-(\Delta\theta-\mu_k)^2/2\sigma^2},$$

where $\mu_k$ is the mean value of the symmetric phase differences of the samples in the k-th state.

Note that treating the projections of the Gaussian noise onto the interferer vector rotated counterclockwise by 90° as Gaussian is an

projiuH 

approximation,  which  only  makes  sense  if  - — —  is 

A 

small  relative  to  (}),  i.e.,  that  the  small  signal 
hypothesis  applies  to  the  Gaussian  noise  term  relative 
to  the  interference  as  well  as  to  the  signal  term  relative 
to  the  interference. 

A Gaussian mixture model for the probability density function of the sample symmetric phase differences in the k-th state has probability density function

$$p_k(\Delta\theta) = \frac{1}{\sqrt{2\pi}\,\sigma_k}e^{-\Delta\theta^2/2\sigma_k^2}.$$


The  first-order  and  second-order  Gaussian  mixture 
model  detectors  can  also  be  viewed  as  functions  of 
the  complex  sample  norms.  This  was  the  viewpoint 
for  the  original  derivation  of  the  detectors  by  Stein, 
Bond,  and  Zeidler  (1993). 

Mixture  models  for  phase  are  not  discussed,  since 
rarely  would  such  models  provide  significant 
performance  other  than  for  a  narrowband  interferer 
received  with  linear  phase,  which  can  better  be 
modeled  by  a  one-state  model.  A  Gaussian  mixture 
model  for  symmetric  phase  differences  could  model 
the  symmetric  phase  differences  of  the  interferer 
component  of  the  interference  plus  noise  as  zero  or 


The probability density function of symmetric phase differences in an S-state mixture model, either Gaussian or noncentral, has the form

$$p(\Delta\theta) = \sum_{k=1}^{S}p_k(\Delta\theta)\,p_k,$$

where $p_k(\Delta\theta)$ is the probability density function of the symmetric phase differences of the samples in the k-th state and $p_k$ is the probability that a sample is in the k-th state.

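For a noncentral mixture model, the first-order detector for symmetric phase differences for sample j is

$$-\frac{p'(\Delta\theta_j)}{p(\Delta\theta_j)}\,\frac{\mathrm{Re}[i z_j s_j^*]}{|z_j|^2|s_j|} = \left(\frac{1}{\sigma^2}\sum_{k=1}^{S}(\Delta\theta_j - \mu_k)\,w(j,k)\right)\frac{\mathrm{Re}[i z_j s_j^*]}{|z_j|^2|s_j|}$$

and the second-order detector for sample j is

$$\frac{p''(\Delta\theta_j)}{p(\Delta\theta_j)}\,\frac{1}{|z_j|^2} = \left(\frac{1}{\sigma^2}\sum_{k=1}^{S}\left[\frac{(\Delta\theta_j - \mu_k)^2}{\sigma^2} - 1\right]w(j,k)\right)\frac{1}{|z_j|^2}$$

with

$$w(j,k) = \frac{p_k\,e^{-(\Delta\theta_j-\mu_k)^2/2\sigma^2}}{\sum_{h=1}^{S}p_h\,e^{-(\Delta\theta_j-\mu_h)^2/2\sigma^2}}.$$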
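The weights and the two probability density quotients of a noncentral mixture model can be computed directly from the state means, the state probabilities, and the common standard deviation. The following is a minimal sketch under those assumptions; the function names and the numerical values are hypothetical, and the geometric factors involving $z_j$ and $s_j$ are omitted.

```python
import math

def mixture_pdf(x, means, priors, sigma):
    """Noncentral mixture density: sum over k of p_k * Normal(x; mu_k, sigma^2)."""
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return sum(p * norm * math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
               for mu, p in zip(means, priors))

def weights(x, means, priors, sigma):
    """w(j,k): probability that the sample value x belongs to state k."""
    kernels = [p * math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))
               for mu, p in zip(means, priors)]
    total = sum(kernels)
    return [kern / total for kern in kernels]

def first_order_quotient(x, means, priors, sigma):
    """-p'(x)/p(x) = (1/sigma^2) * sum_k (x - mu_k) * w(j,k)."""
    w = weights(x, means, priors, sigma)
    return sum((x - mu) * wk for mu, wk in zip(means, w)) / sigma ** 2

def second_order_quotient(x, means, priors, sigma):
    """p''(x)/p(x) = (1/sigma^2) * sum_k [(x - mu_k)^2 / sigma^2 - 1] * w(j,k)."""
    w = weights(x, means, priors, sigma)
    return sum(((x - mu) ** 2 / sigma ** 2 - 1.0) * wk
               for mu, wk in zip(means, w)) / sigma ** 2

# Hypothetical three-state model of symmetric phase differences (radians).
MEANS = [-0.1, 0.2, 0.5]
PRIORS = [0.3, 0.5, 0.2]
SIGMA = 0.1
X = 0.15
w_example = weights(X, MEANS, PRIORS, SIGMA)
g1 = first_order_quotient(X, MEANS, PRIORS, SIGMA)
g2 = second_order_quotient(X, MEANS, PRIORS, SIGMA)
print(w_example, g1, g2)
```

The weights are non-negative and sum to one, and the two quotients agree with finite-difference derivatives of the mixture density.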

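For a Gaussian mixture model,

$$-\frac{p'(\Delta\theta_j)}{p(\Delta\theta_j)}\,\frac{\mathrm{Re}[i z_j s_j^*]}{|z_j|^2|s_j|} = \left(\sum_{k=1}^{S}\frac{\Delta\theta_j}{\sigma_k^2}\,w(j,k)\right)\frac{\mathrm{Re}[i z_j s_j^*]}{|z_j|^2|s_j|}$$

and

$$\frac{p''(\Delta\theta_j)}{p(\Delta\theta_j)}\,\frac{1}{|z_j|^2} = \left(\sum_{k=1}^{S}\frac{1}{\sigma_k^2}\left[\frac{\Delta\theta_j^2}{\sigma_k^2} - 1\right]w(j,k)\right)\frac{1}{|z_j|^2}$$

with

$$w(j,k) = \frac{\dfrac{p_k}{\sigma_k}\,e^{-\Delta\theta_j^2/2\sigma_k^2}}{\sum_{h=1}^{S}\dfrac{p_h}{\sigma_h}\,e^{-\Delta\theta_j^2/2\sigma_h^2}}.$$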


For all the mixture models discussed for amplitude and symmetric phase differences, the weights satisfy the conditions

(a) $w(j,k) \geq 0$ for every state k, and

(b) $\sum_{k} w(j,k) = 1$.

Furthermore, the weights can be interpreted as the probability that the j-th sample belongs to the k-th state modeling the interference.

The detectors use a fuzzy set model (Klir and Folger, 1988) of sample amplitudes and symmetric phase differences. Each sample $z_j$ is assumed to be a member of one and only one state of the model as shown in figure 4. The model is constructed using one set of baseband samples to represent interference. The model is then used to process another (possibly the same) set of baseband samples to detect a signal in the presence of the interference. Due to the presence of Gaussian noise, it is desirable to assign set membership using probabilities. In this sense, the locally optimum processing algorithms involve fuzzy set modeling.

All the detectors described in this subsection can be implemented in two ways. One way is to assume a given model and estimate the parameters of the model. The other way is to correlate an empirical distribution of the interference samples with a predetermined family of distributions with known parameters and choose the distribution that has the highest correlation with the empirical distribution as a model of the interference. The samples used to construct the interference model in either of these ways could be drawn from the frequency bin, or the two bins, adjacent to the bin containing the sample processed for signal. In this manner, the samples used to construct the model are signal free. A third way to obtain detectors involves implicit multistate modeling of the interference. The motivation for this approach is the NSE algorithm used by several undersea surveillance systems.

Implicit Models.

Noise Equalization Model.

The NSE algorithm is closely related to adaptive locally optimum processing. The following discussion provides motivation for adaptive locally optimum processing algorithms obtained through use of implicit models.


The  quantity  appearing  in  the  Gaussian  mixture 
algorithms  has  the  form 


which  can  be  written  as 

lz;P-2a^  o2 
^  -.2  ^2  ■ 
The quantity in brackets, used in the NSE algorithm, is the predicted signal energy normalized by the predicted noise variance. The NSE algorithm can be interpreted using a mixture model in the following way. The states are defined by a prescribed set of possible variances, $\sigma_1^2, \sigma_2^2, \ldots$, of the broadband background noise. That is, the k-th state has variance $\sigma_k^2$. Then the implicit model is that all the samples in a given state have independent real and imaginary components with zero-mean Gaussian distributions with variances equal to the variance for the state. Thus the probability density function for state k is

$$p(z\,|\,k) = \frac{1}{2\pi\sigma_k^2}e^{-|z|^2/2\sigma_k^2}.$$

Observe that $p(k\,|\,z_j) = 0$ or $1$ is implicitly assumed in the NSE algorithm. In other words, a sample is assigned to one and only one state for the NSE algorithm, in contrast to the probabilistic assignments for the Gaussian mixture model. Thus the NSE algorithm involves sets rather than fuzzy sets.
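The distinction between set membership and fuzzy set membership can be made concrete with a short sketch. It assumes a two-state model with hypothetical variances and state probabilities, and it uses a maximum-posterior rule as a stand-in for the hard assignment; the report does not specify the rule the NSE algorithm actually uses.

```python
import math

def state_density(z_abs, variance):
    """Zero-mean circular Gaussian density of a complex sample with norm z_abs."""
    return math.exp(-z_abs ** 2 / (2.0 * variance)) / (2.0 * math.pi * variance)

def soft_assignment(z_abs, variances, priors):
    """Probabilistic (fuzzy) membership p(k | z) for each state."""
    likelihoods = [p * state_density(z_abs, v) for v, p in zip(variances, priors)]
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]

def hard_assignment(z_abs, variances, priors):
    """Membership restricted to 0 or 1 (hypothetical maximum-posterior rule)."""
    soft = soft_assignment(z_abs, variances, priors)
    best = max(range(len(soft)), key=lambda k: soft[k])
    return [1.0 if k == best else 0.0 for k in range(len(soft))]

VARIANCES = [1.0, 4.0]   # hypothetical state variances
PRIORS = [0.5, 0.5]      # hypothetical state probabilities
soft = soft_assignment(2.0, VARIANCES, PRIORS)
hard = hard_assignment(2.0, VARIANCES, PRIORS)
print(soft, hard)
```

For a sample of norm 2 the soft memberships are split between the two states, while the hard assignment places the sample entirely in the higher-variance state.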


The  results  of  the  SOSUS  noise  equalization  algorithm 
are  presented  to  a  human  on  a  LOFARGRAM  using  a 
grey  scale  whose  intensity  is  proportional  to  the 


logarithm  base  2  of 


2ot 


The  logarithmic 


scale  is  used  to  match  the  displayed  grey  scale  to  the 
response  characteristics  of  the  human  eye.  For 
broadband  noise  changing  slowly  relative  to  the  FFT 

<jI„ 

samples  duration,  the  omission  of  the  factor  — — 


probably  has  little  impact  on  eye  integration. 


Implicit Noncentral Mixture Models.

The adaptive locally optimum processing techniques developed for communication applications arise out of an implicit representation of a noncentral mixture model. Recall that the assumption of independent amplitude and phase leads to parallel processing of baseband sample amplitudes and phases.

Different adaptive locally optimum processing algorithms arise from the different ways of estimating the true probability density function from the received signal statistics. Many of these algorithms result from an implicit representation of a noncentral multistate mixture model. The resulting algorithms are practical to implement and have proven particularly effective for communication applications when the interference is non-stationary.

Probability density estimation has been widely studied by mathematical statisticians during the last 20 years (Silverman, 1986). One powerful and efficient approach is to represent the probability density function as a sum of Gaussian kernels defined by the discrete samples. Gaussian kernels can be used to recursively implement adaptive locally optimum processing algorithms in the following way.

Suppose that $X_j$ is to be processed using the samples $\{X_{j+k} \mid -N \leq k \leq N,\ k \neq 0\}$ and their statistics. The constant N is chosen depending on the stationarity of the interference and the sample rate. Typical values of N are between 8 and 32 when sampling at the Nyquist rate. The Gaussian kernel estimate of the probability density function is

$$p(X) = \frac{1}{\sqrt{2\pi}\,\sigma_j}\left(\frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}e^{-(X-X_{j+k})^2/2\sigma_j^2}\right),$$

where $\sigma_j^2$ is the variance of $\{X_{j+k} \mid -N \leq k \leq N,\ k \neq 0\}$. Given this representation, the probability density quotient involved in the first-order detector for a signal of known structure takes the particularly simple form

$$-\frac{p'(X)}{p(X)}\bigg|_{X=X_j} = \frac{1}{\sigma_j^2}\,\frac{\sum_{k=-N,\,k\neq 0}^{N}(X_j - X_{j+k})\,e^{-(X_j-X_{j+k})^2/2\sigma_j^2}}{\sum_{k=-N,\,k\neq 0}^{N}e^{-(X_j-X_{j+k})^2/2\sigma_j^2}} = \frac{1}{\sigma_j^2}\sum_{k=-N,\,k\neq 0}^{N}(X_j - X_{j+k})\,w(j,k),$$

where

$$w(j,k) = \frac{e^{-(X_j-X_{j+k})^2/2\sigma_j^2}}{\sum_{h=-N,\,h\neq 0}^{N}e^{-(X_j-X_{j+h})^2/2\sigma_j^2}}.$$

The probability density quotient involved in the second-order detector for a signal of unknown structure, given that the first-order detector has expected value 0, takes the form

$$\frac{p''(X)}{p(X)}\bigg|_{X=X_j} = \frac{1}{\sigma_j^2}\,\frac{\sum_{k=-N,\,k\neq 0}^{N}\left[\dfrac{(X_j-X_{j+k})^2}{\sigma_j^2} - 1\right]\kappa(j,j+k)}{\sum_{k=-N,\,k\neq 0}^{N}\kappa(j,j+k)} = \frac{1}{\sigma_j^2}\sum_{k=-N,\,k\neq 0}^{N}\left[\frac{(X_j-X_{j+k})^2}{\sigma_j^2} - 1\right]w(j,k),$$

where $\kappa(j,j+k) = e^{-(X_j-X_{j+k})^2/2\sigma_j^2}$ and with the weights $w(j,k)$ defined as they were for the first-order detector.
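The Gaussian kernel quotients for the first-order and second-order detectors can be sketched as follows. The sketch assumes a real-valued statistic (an amplitude or a symmetric phase difference), uses the 2N neighbouring samples with the centre sample excluded, and takes the population variance of those neighbours as the kernel variance; the function name and the sample values are hypothetical.

```python
import math

def kernel_quotients(samples, j, n_half):
    """Return (-p'/p, p''/p, weights) at X_j from the 2N neighbours of sample j."""
    x_j = samples[j]
    neighbours = [samples[j + k] for k in range(-n_half, n_half + 1) if k != 0]
    mean = sum(neighbours) / len(neighbours)
    variance = sum((x - mean) ** 2 for x in neighbours) / len(neighbours)
    # Gaussian kernels centred on each neighbouring sample, evaluated at X_j.
    kernels = [math.exp(-(x_j - x) ** 2 / (2.0 * variance)) for x in neighbours]
    total = sum(kernels)
    w = [kern / total for kern in kernels]
    first = sum((x_j - x) * wk for x, wk in zip(neighbours, w)) / variance
    second = sum(((x_j - x) ** 2 / variance - 1.0) * wk
                 for x, wk in zip(neighbours, w)) / variance
    return first, second, w

# Hypothetical amplitudes; the centre sample sits symmetrically among its neighbours.
AMPLITUDES = [1.0, 2.0, 3.0, 4.0, 5.0]
first_q, second_q, w_kernel = kernel_quotients(AMPLITUDES, 2, 2)
print(first_q, second_q, w_kernel)
```

When the neighbours are placed symmetrically about the processed sample, the first-order quotient vanishes, which is the situation in which the second-order quotient is of interest.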


For undersea surveillance applications, it is often desirable to process the sample $X_j$ with noise-only samples. However, the samples $\{X_{j+k} \mid -N \leq k \leq N,\ k \neq 0\}$ would in general contain signal and noise. When processing frequency domain data for narrowband signals, we can use samples $\{X_{j+k} \mid -N \leq k \leq N,\ k \neq 0\}$ from frequency and time bins adjacent to the one containing the sample to be processed. The resulting algorithms involve implicit modeling of the interference statistics for $X = A$ and $X = \Delta\theta$, respectively. Each sample $X_{j+k}$ defines a state with mean $\mu_{j+k} = X_{j+k}$ and variance $\sigma_j^2$. The probability density function of the samples in state j+k is the Gaussian kernel

$$p_{j+k}(X) = \frac{1}{\sqrt{2\pi}\,\sigma_j}e^{-(X-X_{j+k})^2/2\sigma_j^2}.$$

The weight $w(j,k)$ is the membership function for the sample $X_j$ in state j+k viewed as a fuzzy set as described earlier and illustrated in figure 4. The weights $w(j,k)$ also have a simple algebraic interpretation for the first-order detector. Let $w(j,j) = 0$. Observe that $w(j,k) \geq 0$ for $k \neq j$ and that $\sum_{k} w(j,k) = 1$. These conditions exhibit the calculation of the gain factor term as a filtering operation (Bond and Hui, to appear).


Implicit  Gaussian  Mixture  Models. 


The general technique of probability density estimation by Gaussian kernels has to be modified to allow nonparametric estimation of Gaussian mixture models. As before, the statistics are estimated recursively. Suppose that $z_j$ is to be transformed using interference statistics estimated from the samples $\{z_{j+k} \mid -N \leq k \leq N,\ k \neq 0\}$. Consider estimation of the probability density function by

$$p(|z|) = \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}G_{j+k}(|z|),$$

where $G_{j+k}(|z|) = \dfrac{1}{2\pi\sigma_{j+k}^2}e^{-|z|^2/2\sigma_{j+k}^2}$ with $\sigma_{j+k}^2 = |z_{j+k}|^2$.

Note that the above estimation is precisely the Gaussian kernel approximation where the parameters are the variances of the different states. We arrive at the above form by observing that for z in state k, $|z|^2$ is an unbiased estimator of $\sigma_k^2$. To avoid numerical difficulties in the actual implementation, samples with energy less than a prescribed level are not used in building the implicit model.


The  first-order  detector,  that  is  the  detector  for  signal 
of  known  structure,  for  a  nonparametric  representation 
of  the  Gaussian  mixture  model  is 


1  i 


h=-NJc*0 


\sj\ 


while  the  second-order  detector,  that  is  the  detector 
for  signal  of  unknown  structure,  is 


N 


Iz  |2 


k=-NJc*0  G 


-]GM\zj\), 


’/+* 


V* 


where  in  either  case 


G^t(lzl)  =  -— 


G>.*(lzl) 


2JV  ^lt=-NJc*0 


G^*(lzl)’ 


The technique of probability density estimation that uses Gaussian kernels has to be modified slightly to allow nonparametric estimation of a Gaussian mixture model for symmetric phase differences. As for squared norms, the statistics are estimated recursively. Suppose that $\Delta\theta_j$ is to be transformed using interference statistics estimated from the samples $\{\Delta\theta_{j+k} \mid -N \leq k \leq N,\ k \neq 0\}$. Estimate the probability density function by

$$p(\Delta\theta) = \frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}G_{j+k}(\Delta\theta),$$

where

$$G_{j+k}(\Delta\theta) = \frac{1}{\sqrt{2\pi}\,\sigma_{j+k}}e^{-\Delta\theta^2/2\sigma_{j+k}^2}$$

with $\sigma_{j+k}^2 = \Delta\theta_{j+k}^2$.

To avoid numerical difficulties, samples with symmetric phase difference less than a prescribed value are not used in building the implicit model for the symmetric phase difference Gaussian mixture model.


The  first-order  detector,  that  is  the  detector  for  signal 
of  known  structure,  for  this  nonparametric 
representation  of  the  Gaussian  mixture  model  is 


N 

I  z 

Jt=-NJM 


GMAQj) 


]AQRe[iZj-^], 


while  the  second-order  detector,  that  is  the  detector 
for  signal  of  unknown  structure,  is 


N 

s 


A0,^ 


Ip>-NJc*0  O 

where  in  either  case 
G^(A0)  = 


-^]G^(A0,), 

*J+k 

GMAQ) 

Gj+h(AQ) 


The probability density function for a Gaussian mixture of amplitudes or of symmetric phase differences is modeled for sample $z_j$ as a 2N-state model with the states all equally likely, so that $p_k = \frac{1}{2N}$ for $-N \leq k \leq N$, $k \neq 0$. The k-th kernel $G_{j+k}(|z_j|)$ or $G_{j+k}(\Delta\theta_j)$ is the probability density function for the k-th state. For the amplitude case, this probability density function can arise from zero-mean Gaussian probability density functions of equal variance for the real and imaginary components of the sample. In this context, the variance of these components is simply $|z_{j+k}|^2$. Furthermore, if we assume that $p(z\,|\,k) = G_{j+k}(z)$, then $w(j,k)$ becomes the probability that $|z_j|$ belongs to the k-th state. For the symmetric phase difference case, the component variance is $\Delta\theta_{j+k}^2$. In either case, the nonparametric estimation of a Gaussian mixture model leads to a fuzzy set model for the resulting detectors as for the kernel algorithms.


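Useful information on the way to nonparametrically estimate a Gaussian mixture model is obtained by considering

$$\lim_{N\to\infty}\frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}G_{j+k}(|z|)$$

when the samples $z_{j+k}$ are independent and identically distributed with probability density function

$$p(z) = \sum_{h=1}^{S}\frac{p_h}{2\pi\sigma_h^2}e^{-|z|^2/2\sigma_h^2}.$$

Then

$$\lim_{N\to\infty}\frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}G_{j+k}(|z|) = \frac{1}{4\pi^2}\iint\frac{1}{|w|^2}e^{-|z|^2/2|w|^2}\sum_{h=1}^{S}\frac{p_h}{\sigma_h^2}e^{-|w|^2/2\sigma_h^2}\,dw_1\,dw_2$$

$$= \frac{1}{2\pi}\int_0^{\infty}\frac{1}{r}e^{-|z|^2/2r^2}\sum_{h=1}^{S}\frac{p_h}{\sigma_h^2}e^{-r^2/2\sigma_h^2}\,dr.$$

Using the substitution $u = r^2$ and interchanging the order of integration and summation, we obtain

$$\lim_{N\to\infty}\frac{1}{2N}\sum_{k=-N,\,k\neq 0}^{N}G_{j+k}(|z|) = \frac{1}{2\pi}\sum_{h=1}^{S}\frac{p_h}{\sigma_h^2}K_0\!\left(\frac{|z|}{\sigma_h}\right),$$

where $K_0(x)$ is the modified Bessel function of the second kind of order 0 (Gradshteyn and Ryzhik, 1980).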
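The radial integral that produces the Bessel function can be checked numerically without a special-function library. The sketch below assumes the standard integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ and evaluates the radial integral after the change of variable $r = e^t$; the helper names, integration limits, step counts, and test values are arbitrary choices.

```python
import math

def trapezoid(func, lower, upper, steps):
    """Composite trapezoid rule on [lower, upper]."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * width)
    return total * width

def k0_integral(x):
    """K_0(x) from the representation: integral over t >= 0 of exp(-x * cosh(t))."""
    return trapezoid(lambda t: math.exp(-x * math.cosh(t)), 0.0, 20.0, 20000)

def radial_integral(z_abs, sigma):
    """Integral over r > 0 of (1/r) * exp(-z^2/(2 r^2) - r^2/(2 sigma^2)), using r = exp(t)."""
    def integrand(t):
        r_squared = math.exp(2.0 * t)
        return math.exp(-z_abs ** 2 / (2.0 * r_squared) - r_squared / (2.0 * sigma ** 2))
    return trapezoid(integrand, -12.0, 12.0, 24000)

Z_ABS, SIGMA = 1.5, 0.7   # hypothetical sample norm and state standard deviation
radial_value = radial_integral(Z_ABS, SIGMA)
k0_value = k0_integral(Z_ABS / SIGMA)
print(radial_value, k0_value)
```

The two values agree to many digits, which supports the identification of the limiting kernel average with $K_0(|z|/\sigma_h)$ for each state.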
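In an analogous manner,

$$ \lim_{N\to\infty} \frac{1}{2N} \sum_{k=-N,\,k\ne 0}^{N} G_{j+k}(\Delta\theta)
 = \sum_{k=1}^{S} \frac{p_k}{\pi\sigma_k}\, K_0\!\left(\frac{|\Delta\theta|}{\sigma_k}\right) $$

for

$$ p(\Delta\theta) = \sum_{k=1}^{S} \frac{p_k}{\sqrt{2\pi}\,\sigma_k}\, e^{-\Delta\theta^2/2\sigma_k^2} . $$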
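As a numerical illustration (not part of the original report), the following Python sketch evaluates $K_0$ from its standard integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ with a plain trapezoidal rule, and compares a unit-area Bessel kernel with a unit-area Gaussian kernel. The function names, the truncation point `t_max`, the step count, and the $1/(\pi\sigma)$ normalization of the Bessel kernel are assumptions made for this sketch.

```python
import math


def bessel_k0(x, t_max=12.0, steps=12000):
    """Approximate K_0(x), x > 0, from K_0(x) = integral_0^inf exp(-x*cosh(t)) dt."""
    if x <= 0.0:
        raise ValueError("K_0(x) diverges as x -> 0; x must be positive")
    h = t_max / steps
    # Trapezoidal rule; the integrand at t_max is negligible for x >= 1e-3.
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h


def bessel_kernel(x, sigma):
    """Unit-area kernel K_0(|x|/sigma) / (pi*sigma) on the real line."""
    return bessel_k0(abs(x) / sigma) / (math.pi * sigma)


def gaussian_kernel(x, sigma):
    """Unit-area zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


if __name__ == "__main__":
    # The Bessel kernel grows without bound as x -> 0; the Gaussian does not.
    for x in (0.01, 0.1, 0.5, 1.0, 2.0):
        print(x, bessel_kernel(x, 1.0), gaussian_kernel(x, 1.0))
```

The growth of the Bessel kernel at small arguments is the reason samples with very small norm need special handling in a practical algorithm.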


In either case, the use of the nonparametric
techniques transforms the Gaussian kernel
$e^{-x^2/2\sigma_k^2}$ into the Bessel kernel $K_0(|x|/\sigma_k)$.
Figure 5 shows the relationship
between these two functions. Both the Bessel
function and the Gaussian function are symmetric, so
figure 5 shows the functions for non-negative real
numbers only. In addition, they have been scaled to have
unit area over the whole real axis. Since the Bessel
function $K_0$ tends to infinity as its argument tends to
0, we need to modify the states for the samples with
low values to obtain practical algorithms. The
simplest approach is to rank order the samples from
lowest to highest norm and not use the lowest 10% of
the samples to construct the implicit model.
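The rank-order trimming described above can be sketched in a few lines of Python. This is an illustration only: the function name `trim_lowest_norm` and the choice to keep the surviving samples in their original time order are assumptions, not details given in the report.

```python
def trim_lowest_norm(samples, fraction=0.10):
    """Drop the given fraction of samples with the smallest norm.

    The surviving samples are returned in their original order so that
    they can still be used as kernel centers for the implicit model.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must lie in [0, 1)")
    n_drop = int(len(samples) * fraction)
    # Rank order the sample indices from lowest to highest norm.
    ranked = sorted(range(len(samples)), key=lambda i: abs(samples[i]))
    dropped = set(ranked[:n_drop])
    return [z for i, z in enumerate(samples) if i not in dropped]


if __name__ == "__main__":
    block = [complex(k, 0.0) for k in (5, 1, 9, 3, 7, 2, 8, 4, 10, 6)]
    print(trim_lowest_norm(block))
```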


Processing Gain for Multistate Model Detectors.


In this section, we obtain upper bounds and estimates
for the square of the ratio of the deflection of adaptive
locally optimum detectors to that of the traditional
detectors. We call this squared ratio the processing gain
for the first-order adaptive locally optimum processing
algorithm relative to the traditional processing
algorithm; for second-order detectors, the processing
gain is the ratio of deflections. We obtain upper bounds
for processing gain for noncentral mixture models of
amplitudes and symmetric phase differences for signals
of known and unknown structure, and for Gaussian
mixture models of amplitudes for signals of known and
unknown structure. We have also obtained estimates
of processing gain for the Gaussian mixture models of
amplitude and conducted simulations to validate the
bounds and estimates. These results relate
achievable processing gain to mixture model
parameters and provide a framework for assessing the
potential processing gains achievable from the use of
adaptive locally optimum processing instead of
traditional processing. Also, note that it is
unnecessary to obtain Gaussian mixture model
symmetric phase difference processing gain bounds
and estimates because this case is not of interest for
ocean basin surveillance.

Processing Gain for a Noncentral Mixture Model
First-order Detector.


Let the probability density function of the amplitudes of
the noise be

$$ p(A) = \frac{1}{\sqrt{2\pi}\,\sigma_{bn}} \sum_{k=1}^{S} p_k\, e^{-(A-\mu_k)^2/2\sigma_{bn}^2} , \quad A > 0 . $$

We assume that $\mu_k \gg \sigma_{bn}$ for $1 \le k \le S$. It then
follows that $\int_0^\infty p(A)\,dA \approx \int_{-\infty}^{\infty} p(A)\,dA$. We also
assume that the phase is uniformly distributed. In
addition, let the signal be of known structure, that is
$s = (a,b) = a + ib$ and $s/|s|$ is known.

For the classical correlator,

$$ c(x,y) = \frac{ax + by}{|s|} $$

and we have

$$ E_n(c(x,y)) = 0 \quad\text{and}\quad E_{s+n}(c(x,y)) = |s| . $$

The  variance  contains  a  broadband  noise  component 
al„ ,  the  sum  of  the  variances  of  the  inphase  and 
quadrature  components  of  the  background  noise,  and 
a  component  contributed  by  the  interferer  with 
different  discrete  values  of  amplitude.  Under  the 
assumption  that  the  interferer  and  noise  components 
are  independent  and  radially  symmetric,  we  have 
al=aj=  so  that 

C!licix,y))  =  -^<^1  + 

=  o^=al  +  Zpk^l 

*=i 

This  follows  from  an  easy  computation  using  the 
approximate  density  function  for  p{A). 

Therefore,  the  deflection  for  the  classical  correlator  is 

5(c(x,7))  = 

and  its  square  is  simply  the  signal-to-noise  ratio. 

To  obtain  an  upper  bound  on  the  processing  gain  for 
the  first-order  adaptive  locally  optimum  processing 
algorithm,  we  suppose  that  each  sample  is  assigned 
to  the  correct  state.  This  is  equivalent  to  assuming 
that  the  adaptive  locally  optimum  processing  is 
effective  in  completely  removing  the  interference. 

Given  the  probability  density  function  of  amplitudes 

p{A)  =  -f=-  'Zpje'^^ , 

J2n  G  j=i 
subject to the constraint that a sample's amplitude
must  be  non-negative.  The  gain  factor  for  the  optimal 
first-order  detector  is 


The  -j  term  in  the  gain  factor  can  be  neglected  under 


the  assumption  on  A.  Indeed,  practical  experience 
with  algorithms  implemented  for  communication 
systems  has  indicated  that  excellent  performance  can 
be  achieved  by  use  of  the  modified  gain  factor 

whose  performance  should  be  quite  similar  to  that  of 
the  optimal  detector  when  the  small  signal  hypothesis 
holds  for  all  of  the  discrete  values  of  the  deterministic 
interferer  amplitudes,  the  condition  required  for  the 
model  being  analyzed  to  be  applicable.  We  now 
proceed  to  obtain  the  processing  gain  for  the 

first-order  detector  D i  (x,y)  =  .  ]) 

Izl  \s\ 

associated  with  this  modified  gain  factor. 


Under  the  assumption  that  each  sample  is  assigned  to 
the  correct  state: 


iDdx,y))  =  (Ddx,y)) 


1 


Figures 6a and b present processing gain upper bound
contours for a two-state noncentral mixture model
first-order detector for $0 < p_l < 1$, $0 < \mu_h < 20$,
$\mu_l = 1$, and $\sigma_{bn}^2 = 0.5$ and $\sigma_{bn}^2 = 2$, respectively.

The  curves  show  that  the  processing  gain  is  a  strong 
function  of  the  level  of  the  background  noise  and  a 
weak  function  of  the  low-state  probability. 

Consider  the  first-order  detector  for  symmetric  phase 
differences  modeled  by  a  noncentral  mixture  model. 
Given  the  probability  density  function 
s  . 

^(AO)  =  X  Qk  - —  e  20^  . 

*=i  v27c  a 

The approximation is only good when the phase
contribution of the Gaussian noise $\psi$ to the received
signal phase is small, that is

i::„9(A0we=Ji9(Ae)r/e. 

For  this  density  the  first-order  detector  is 

Di(x,y)  =  hdAB)m^j^]) 

with 

5 

hiiAe)  = - 7=^X(A0-X>ye'  . 

q{AQ)j2KO^Mi 

Under  the  assumption  that  each  sample  is  assigned  to 
the  correct  state  as  for  the  amplitude  case, 


—  En-¥s  {[projuis)  +projuin)] 


=  E„+si[proju{s)]-^) 

from  the  earlier  discussion  of  a  one-state  model  for 
amplitudes.  Under  the  added  assumption  that  the 
phase  of  the  signal  relative  to  the  interferer  is 
uniformly  distributed,  the  expected  value  of  the 
projection  is 


ybl  .  Also,  a2(D,(x,:y))  =  so  that 


5^(Di  (x,>'))  =  It  follows  that  the  processing 

gain, under the assumptions that each sample is
assigned to the correct state and the relative phase of
signal to interference has uniform distribution, is


E„^u+dDi  {x,y))  =  En+s  iDi{x,y))^ 

=  E„^AhdAQ)m^^]) 


“  Ert+s  {[projiu(As)  +projUAn)] 


=  E„+si\projiu{As)]-^). 


To  evaluate  this  expectation,  we  assume  that  the 
signal  sample  j  is  uncorrelated  with  signal  samples  j-1 
and  j+1,  and  that  the  relative  phase  of  the  signal  to  the 
interferer  is  uniform.  Under  the  former  assumption 


£n+s(l;?rayi„(As)]-^)  =  E„+si\projAs)]-^), 
and under the latter assumption, as for the amplitude
case,

£»+5([proy«(^)]^)  =  jbl . 

However,  under  the  assumption  of  independence  of 
the  successive  samples  of  the  broadband  noise 
component,  the  variance  of  the  broadband  noise 
component  increases  due  to  the  adjacent  noise  terms 
so  that 

It  follows  that  the  deflection  squared  for  the  first-order 
detector  of  symmetric  phase  differences  is 


ibii 

3 


bn 


There  is  no  traditional  phase  processing  so  we  use  the 
same  traditional  processing  deflection  value  for  phase 
as  for  amplitudes.  Then  the  processing  gain  for 
symmetric  phase  differences  depends  only  on  the 
amplitude  model  as  defined  for  the  amplitude  case 
considered  in  this  section.  The  processing  gain  for  the 
first-order  detector  of  symmetric  phase  difference  is 
then 


1 


Note that due to the assumption that each sample is
assigned to the correct state, the structure of the
symmetric phase model plays no role in the
achievable processing gain for symmetric phase
differences.


I .  2 .  •  ■  4  S'  It  follows  that  the  variance  of  the 

real  components  samples  in  state  k  is 

lil  rn  li] 

cos^vjrrAir  =  —  and  the  variance  of  the 

2lt  J-n  2 


I  2 

interference  is  given  by  ■:r  •  Thus 

^  *=i 


oI{T2(z))  =  +projun)*] 

=  +  Su^proJlin)  +projt{n)] 

s  s 

=  +  (>i'LPk\il)ol„  3at]. 

*,1 

It  follows  that  the  deflection  for  the  traditional  detector 
is 

Pkik*  Pklkl)oln 


To obtain an approximate upper bound on the
second-order detector, we assume that each sample
is assigned to the state containing its discrete
component of interference so that the deflection for
the multistate detector is the same as for the one-state
detector, that is

1  Isl^ 
ol' 

It  follows  that  an  approximate  upper  bound  for  the 
processing  gain  for  a  second-order  detector  when  the 
interference-plus-noise  amplitudes  are  described  by  a 
noncentral  mixture  model  is 


Processing  Gain  for  a  Noncentral  Mixture  Model 
Second-order  Detector. 


Suppose  that  the  probability  density  function  for  the 
noncentral  mixture  model  for  amplitudes  is 

s  -ill*!! 

p{A)  =  ~iA - Zpke  <  ,A>0. 

V27C  <y/,n  *=1 

For  the  traditional  processing 


with  the  total  variance  of  interferer-plus- 
Gaussian-noise  real  and  imaginary  components  of  the 
baseband  samples.  Note  that  the  interferer 
component  takes  on  discrete  values  of  magnitude 


—^^JI>Pk^t  +  6(ZPk^l^)(yl  +  3al  . 
2j3alJk^i 

Figures 7a and b show the above processing gain
upper bound contours for a two-state noncentral
mixture model for $0 < p_l < 1$, $0 < \mu_h < 20$, $\mu_l = 1$,
and $\sigma_{bn}^2 = 0.5$ and $\sigma_{bn}^2 = 2$, respectively.

We  have  been  unable  to  establish  even  an 
approximate  upper  bound  for  a  noncentral  distribution 
of  symmetric  phase  differences  second-order  detector. 
The  difficulty  is  that  the  argument  used  to  obtain  the 
only  results  we  have  for  the  single  state  model  is 
inconsistent  with  the  assumption  that  the  symmetric 
phase  differences  are  described  by  a  multistate  model. 
A  further  complication  is  that  it  is  unreasonable  to 
suppose for a multistate symmetric phase difference
model  that  the  amplitudes  of  the  interference  samples 
described  by  the  model  are  independent  of  the  state 
containing  the  symmetric  phase  difference.  However, 
we  have  included  the  model,  because  in  practice  the 
nonparametric  implementation  for  the  symmetric 
phase  difference  algorithms  has  proven  effective  in 
communications  (Bond  and  Schmidt,  to  appear)  and 
further  investigation  of  beamformer  output  statistics 
may  reveal  its  applicability  to  ocean  basin  surveillance. 
This  completes  our  treatment  of  noncentral  mixture 
models  of  amplitudes  and  symmetric  phase 
differences. 

Processing  Gain  for  a  Gaussian  Mixture  Model 
First-order  Detector. 


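Let the probability density function of the noise be

$$ p(x,y) = \sum_{k=1}^{S} \frac{p_k}{2\pi\sigma_k^2}\, e^{-(x^2+y^2)/2\sigma_k^2} $$

and let the signal of known structure be
$s = (a,b) = a + ib$.


First, consider the classical correlator:

$$ c(x,y) = \frac{ax + by}{|s|} . $$

We have

(1) $E_n(c(x,y)) = 0$

(2) $E_{s+n}(c(x,y)) = |s|$

(3) $\sigma_n^2(c(x,y)) = \frac{a^2\sigma_x^2 + b^2\sigma_y^2}{|s|^2} = \sigma^2$, where $\sigma^2 = \sum_{k=1}^{S} p_k\sigma_k^2$.


Therefore, the deflection for the classical correlator is

$$ \delta(c(x,y)) = \frac{|s|}{\sigma} $$

and its square is simply the signal-to-noise ratio.
Furthermore, the performance does not depend on the
details of the Gaussian mixture model. Indeed, the
computations only used the fact that the noise had
zero mean and variance $\sigma^2$.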


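Recall that the deflection of the first-order detector
satisfies:

$$ \delta^2(D_1) = \iint_{R^2} D_1^2(x,y)\, p(x,y)\, dx\, dy . $$

For Gaussian mixture noise,

$$ D_1(x,y) = \frac{\sum_{k=1}^{S} p_k\, \frac{ax+by}{\sigma_k^2}\, \frac{1}{2\pi\sigma_k^2}\, e^{-(x^2+y^2)/2\sigma_k^2}}
 {\sum_{k=1}^{S} p_k\, \frac{1}{2\pi\sigma_k^2}\, e^{-(x^2+y^2)/2\sigma_k^2}} . $$

By the Cauchy-Schwarz inequality:

$$ D_1^2(x,y) \le \frac{\sum_{k=1}^{S} p_k \left(\frac{ax+by}{\sigma_k^2}\right)^2 \frac{1}{2\pi\sigma_k^2}\, e^{-(x^2+y^2)/2\sigma_k^2}}
 {\sum_{k=1}^{S} p_k\, \frac{1}{2\pi\sigma_k^2}\, e^{-(x^2+y^2)/2\sigma_k^2}} . $$

Therefore,

$$ \delta^2(D_1) \le \iint_{R^2} \sum_{k=1}^{S} p_k \left(\frac{ax+by}{\sigma_k^2}\right)^2 \frac{1}{2\pi\sigma_k^2}\, e^{-(x^2+y^2)/2\sigma_k^2}\, dx\, dy
 = \sum_{k=1}^{S} \frac{p_k}{\sigma_k^4}\,(a^2\sigma_k^2 + b^2\sigma_k^2)
 = |s|^2 \sum_{k=1}^{S} \frac{p_k}{\sigma_k^2} . $$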

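We can obtain an approximation of $\delta^2(D_1)$ in the
following way. Instead of applying
the Cauchy-Schwarz inequality, use the fact that $p_k$ is
an unbiased estimator of $p(k|z)$ to obtain an
approximation. Expand the product

$$ D_1^2(x,y) = \left[ \sum_{k=1}^{S} \frac{ax+by}{\sigma_k^2}\, p(k|z) \right]^2 $$

and note that the cross-terms are expected to be small
compared to the square-terms. Therefore, it is
reasonable to assume that

$$ D_1^2(x,y) \approx \sum_{k=1}^{S} \left(\frac{ax+by}{\sigma_k^2}\right)^2 p^2(k|z)
 \approx \sum_{k=1}^{S} \left(\frac{ax+by}{\sigma_k^2}\right)^2 p_k\, p(k|z) . $$

With this approximation,

$$ \delta^2(D_1) \approx |s|^2 \sum_{k=1}^{S} \frac{p_k^2}{\sigma_k^2} $$

and so the deflection of the first-order detector is
approximately

$$ \delta(D_1) \approx |s| \left( \sum_{k=1}^{S} \frac{p_k^2}{\sigma_k^2} \right)^{1/2} . $$

It follows from the Cauchy-Schwarz bound that an upper
bound on the processing gain
provided by the square of the ratio of the deflections of
the first-order detector and the correlator is

$$ g_1(p_1,\sigma_1,\ldots,p_S,\sigma_S) = \sigma^2 \sum_{k=1}^{S} \frac{p_k}{\sigma_k^2} . $$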

Note that

$$ 1 = \left(\sum_{k=1}^{S} p_k\right)^2
 = \left(\sum_{k=1}^{S} \frac{\sqrt{p_k}}{\sigma_k}\, \sqrt{p_k}\,\sigma_k\right)^2
 \le \left(\sum_{k=1}^{S} p_k\sigma_k^{-2}\right)\left(\sum_{k=1}^{S} p_k\sigma_k^2\right)
 = g_1(p_1,\sigma_1,\ldots,p_S,\sigma_S) . $$

The above formula can be used to obtain a global
processing gain bound for a multistate model in terms
of a two-state model. The processing gain global
bound for any multistate model first-order detector for
specified interference variance and specified
highest-state-to-lowest-state variance ratio (a measure
of the dynamic range of interference power) is provided
by the two-state model with these parameters. The
proof is presented in appendix A. Thus, it is natural to
investigate two-state Gaussian mixture model
first-order detectors to assess how much processing
gain is available. Additional evidence of the central
role of multistate models with two or three states is
provided by an application of the upper bound
obtained here to Middleton Class A noise first-order
detectors. This discussion can be found in appendix
B.

Figure 8 shows the upper bound for processing gain
$g_1(p_1,\sigma_1,p_2,\sigma_2)$ in decibels as a function of
$p_1$ and $\rho = \sigma_2^2/\sigma_1^2$. Results are presented for
$\rho$ with $1 \le \rho \le 20$, for which the greatest processing
gain is 7 dB.

The lower bound of 1 is sharp if restraints are not
placed on the state probabilities (either explicitly or
implicitly). Consider a multistate model of $N > 2$
states. It is then possible to increase the probability
of any internal state (not the lowest or highest
variance state) while leaving all of the state
variances unchanged. Let K be the index of the internal
state whose probability is to be chosen as arbitrarily
high. Let $\{p'_k \mid 1 \le k \le N,\ k \ne K\}$ be any set of positive
real numbers whose sum is 1 such that
$\sum_{k=1,k\ne K}^{N} p'_k\sigma_k^2 = \sigma_K^2$. Let $p_K = p$ and $p_k = (1-p)\,p'_k$
for $k \ne K$.

Then

$$ \sum_{k=1}^{N} p_k = 1 $$

and

$$ \sigma^2 = \sum_{k=1}^{N} p_k\sigma_k^2
 = (1-p) \sum_{k=1,k\ne K}^{N} p'_k\sigma_k^2 + p\,\sigma_K^2 = \sigma_K^2 , $$

so that the overall variance is equal to that of the K-th
state and independent of p. Then

$$ \lim_{p\to 1} \sum_{k=1}^{N} p_k \frac{\sigma^2}{\sigma_k^2}
 = \lim_{p\to 1} \left\{ \sum_{k=1,k\ne K}^{N} (1-p)\,p'_k \frac{\sigma_K^2}{\sigma_k^2} + p \right\} = 1 . $$
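The first-order upper bound is simple enough to evaluate directly. The Python sketch below is an illustration under stated assumptions: it takes the bound to be $(\sum_k p_k\sigma_k^2)(\sum_k p_k/\sigma_k^2)$ with the state variances $\sigma_k^2$ as inputs, and converts the squared deflection ratio to decibels with $10\log_{10}$. The function names are hypothetical. It sweeps the first-state probability of a two-state model at a variance ratio of 20, the largest ratio quoted for figure 8.

```python
import math


def first_order_gain_bound(probs, variances):
    """Upper bound (sum p_k var_k) * (sum p_k / var_k) on first-order processing gain."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    overall_variance = sum(p * v for p, v in zip(probs, variances))
    inverse_weighted = sum(p / v for p, v in zip(probs, variances))
    return overall_variance * inverse_weighted


def to_db(power_ratio):
    """Express a squared deflection ratio in decibels."""
    return 10.0 * math.log10(power_ratio)


def best_two_state_bound_db(variance_ratio, grid=99):
    """Largest bound over first-state probabilities 0.01, 0.02, ..., 0.99."""
    return max(
        to_db(first_order_gain_bound([i / 100.0, 1.0 - i / 100.0], [1.0, variance_ratio]))
        for i in range(1, grid + 1)
    )


if __name__ == "__main__":
    print(best_two_state_bound_db(20.0))
```

With equal state variances the bound reduces to 1, which is the Cauchy-Schwarz lower bound discussed above.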

Using the estimate obtained earlier for the square of
the deflection, we obtain the following estimate for the
processing gain for the detector:

$$ \sigma^2 \sum_{k=1}^{S} \frac{p_k^2}{\sigma_k^2} . $$


Figure 9 presents processing gain contours for a
two-state Gaussian mixture model first-order detector.

The contours are for low-state probability $0 < p_l < 1$
and high-state-to-low-state variance ratios
$1 < \sigma_h^2/\sigma_l^2 < 20$.

Figure 10 presents the differences between the
processing gain upper bounds and processing gain
estimates for a two-state mixture model first-order
detector. These contours estimate the losses in
processing  gain  due  to  use  of  probabilistic  assignment 
of  samples  to  the  states  of  the  model,  in  other  words, 
losses  encountered  because  the  first-order  detector 
assigns  the  samples  to  fuzzy  sets.  These  losses 
might  be  avoided  through  exploiting  temporal 
correlations  between  successive  samples.  Figure  10 
suggests  that  there  is  a  small  penalty  paid  for  treating 
the  interference  samples  as  independent  even  when 
they  are  correlated. 
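The loss attributed to fuzzy-set assignment can be illustrated numerically. In the sketch below, which is not taken from the report, the loss is assumed to be the decibel difference between the first-order upper bound $\sigma^2\sum_k p_k/\sigma_k^2$ and the first-order estimate $\sigma^2\sum_k p_k^2/\sigma_k^2$; the function names are hypothetical and the inputs are state variances.

```python
import math


def first_order_gain_bound(probs, variances):
    """Upper bound on first-order processing gain (squared deflection ratio)."""
    overall = sum(p * v for p, v in zip(probs, variances))
    return overall * sum(p / v for p, v in zip(probs, variances))


def first_order_gain_estimate(probs, variances):
    """Estimate of first-order processing gain with probabilistic state assignment."""
    overall = sum(p * v for p, v in zip(probs, variances))
    return overall * sum(p * p / v for p, v in zip(probs, variances))


def fuzzy_set_loss_db(probs, variances):
    """Upper bound minus estimate, in decibels."""
    ratio = first_order_gain_bound(probs, variances) / first_order_gain_estimate(probs, variances)
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    for p_low in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p_low, fuzzy_set_loss_db([p_low, 1.0 - p_low], [1.0, 16.0]))
```

Under these assumptions the overall variance cancels in the ratio, so the loss depends only on the state probabilities and the inverse state variances.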


Processing Gain for a Gaussian Mixture Model
Second-order  Detector. 


Observe  that 


-  Pn{z) 

]__(*) 

Pn{2)  k\ 
where  we  use  convolutional  expansion  for  the 
distribution  of  signal  plus  interference.  The  last 

expression  equals  [^"  ]^pn(z)  dz  + 

2  Pn\Z) 


or  2 

higher  order  terms  in  the  moments  of  s,  = 
neglecting  the  higher  order  terms  in  the  moments  of  s. 


It  follows  that 


>D 


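To obtain a bound on the variance $\sigma_D^2$, observe that

$$ \left[ \sum_{k=1}^{S} \frac{|z|^2 - 2\sigma_k^2}{\sigma_k^4}\, p(k|z) \right]^2
 \le \sum_{k=1}^{S} \left( \frac{|z|^2 - 2\sigma_k^2}{\sigma_k^4} \right)^2 p(k|z) $$

by the Cauchy-Schwarz inequality (Halmos, 1957).
Note that equality would hold if all the $p(k|z)$ were
either 0 or 1, that is, each sample was assigned to the
appropriate state. Finally,

$$ \sigma_D^2 \le \int \sum_{k=1}^{S} \left( \frac{|z|^2 - 2\sigma_k^2}{\sigma_k^4} \right)^2 p_k\, \frac{p(z|k)}{p(z)}\, p(z)\, dz
 = 4 \sum_{k=1}^{S} \frac{p_k}{\sigma_k^4} . $$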

Therefore, an upper bound for the deflection of the
adaptive locally optimum processing quantity is given
by

$$ \delta(D_2) \le \frac{|s|^2}{2} \left( \sum_{k=1}^{S} \frac{p_k}{\sigma_k^4} \right)^{1/2} . $$

In addition, the upper bound obtained is expected to
be fairly sharp since it was obtained by dropping
high-order terms in the magnitude of the signal and by
supposing that the state assignment process yields
the same results as a deterministic state assignment
process without errors. Both of these assumptions
underlie the derivation of the adaptive locally
optimum processing algorithm under discussion.

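We can also obtain an approximation of $\sigma_D$ in the
following way. Instead of applying the
Cauchy-Schwarz inequality, we use the fact that $p_k$ is
an unbiased estimator of $p(k|z)$ to obtain an
approximation. Expand the product
$\left[\sum_{k=1}^{S} \frac{|z|^2-2\sigma_k^2}{\sigma_k^4}\, p(k|z)\right]^2$, and note that the
cross-terms are expected to be small compared to the
square-terms. Therefore, it is reasonable to assume
that

$$ \left[ \sum_{k=1}^{S} \frac{|z|^2-2\sigma_k^2}{\sigma_k^4}\, p(k|z) \right]^2
 \approx \sum_{k=1}^{S} \left( \frac{|z|^2-2\sigma_k^2}{\sigma_k^4} \right)^2 p^2(k|z)
 \approx \sum_{k=1}^{S} \left( \frac{|z|^2-2\sigma_k^2}{\sigma_k^4} \right)^2 p_k\, p(k|z) . $$

With the above approximation,

$$ \sigma_D^2 \approx \int \sum_{k=1}^{S} \left( \frac{|z|^2-2\sigma_k^2}{\sigma_k^4} \right)^2 p_k\, p(k|z)\, p(z)\, dz
 = 4 \sum_{k=1}^{S} \frac{p_k^2}{\sigma_k^4} $$

and therefore the deflection is approximately

$$ \delta(D_2) \approx \frac{|s|^2}{2} \left( \sum_{k=1}^{S} \frac{p_k^2}{\sigma_k^4} \right)^{1/2} . $$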


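The numerator of the deflection for the traditional
detection quantity $T(z) = |z|^2/2\sigma_n^2$, with
$\sigma_n^2 = \sum_{k=1}^{S} p_k\sigma_k^2$, when the noise is described by a
Gaussian mixture model is given by

$$ E_{s+n}[T(z)] - E_n[T(z)] = \frac{|s|^2}{2\sigma_n^2} . $$

The variance of the traditional detection quantity for
noise only is

$$ \sigma_T^2 = \sum_{k=1}^{S} \frac{p_k}{\sigma_k^2} \int_0^\infty \left( \frac{A^2}{2\sigma_n^2} - 1 \right)^2 e^{-A^2/2\sigma_k^2}\, A\, dA , $$

where the double integral in x and y is replaced by a
double integral in polar coordinates and $\theta$ (for which
the integrand is a constant) is integrated out. This
integral can be evaluated using integration by parts to
obtain

$$ \sigma_T^2 = 2 \sum_{k=1}^{S} p_k \frac{\sigma_k^4}{\sigma_n^4} - 1 . $$

Therefore the deflection for the traditional detection
quantity is

$$ \delta(T) = \frac{|s|^2}{2\sigma_n^2} \left( 2 \sum_{k=1}^{S} p_k \frac{\sigma_k^4}{\sigma_n^4} - 1 \right)^{-1/2} . $$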


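The upper bound for processing gain, the ratio of the
deflections, can be written in terms of model
parameters alone:

$$ \left[ \left( \sum_{k=1}^{S} p_k \frac{\sigma_n^4}{\sigma_k^4} \right)
 \left( 2 \sum_{k=1}^{S} p_k \frac{\sigma_k^4}{\sigma_n^4} - 1 \right) \right]^{1/2} ,
 \quad \sigma_n^2 = \sum_{k=1}^{S} p_k\sigma_k^2 . $$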


The  above  formula  can  be  used  to  obtain  a  global 
processing  gain  bound  for  a  multistate  model  in  terms 
of  a  two-state  model.  The  processing  gain  global 
bound  for  any  multistate  model  second-order  detector 
for  specified  interference  variance  and  specified 
highest-state-to-lowest-state  variance  is  provided  by 
the  two-state  model  with  these  parameters.  The 
proof  is  similar  to  that  of  the  same  result  for  first-order 
detectors  and  is  presented  in  appendix  A.  Thus  it  is 
natural  to  investigate  two-state  Gaussian  mixture 
model  second-order  detectors  to  assess  how  much 
processing  gain  is  available.  Additional  evidence  of 
the central role of multistate models with two or three
states  is  provided  by  an  application  of  the  upper  bound 
obtained  here  to  Middleton  Class  A  noise  model 
second-order  detectors.  This  discussion  can  be 
found  in  appendix  B. 

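The lower bound of 1 is sharp if restraints are not
placed on the state probabilities (either explicitly or
implicitly). Consider a multistate model of $N > 2$
states. It is then possible to increase the probability
of any internal state (not the lowest or highest
variance state) while leaving all the state variances
unchanged. Let K be the index of the internal state whose
probability is to be chosen as arbitrarily high. Let
$\{p'_k \mid 1 \le k \le N,\ k \ne K\}$ be any set of positive real
numbers whose sum is 1 such that
$\sum_{k=1,k\ne K}^{N} p'_k\sigma_k^2 = \sigma_K^2$. Let $p_K = p$ and $p_k = (1-p)\,p'_k$
for $k \ne K$. Then $\sum_{k=1}^{N} p_k = 1$ and
$\sigma_n^2 = \sum_{k=1}^{N} p_k\sigma_k^2 = (1-p)\sum_{k=1,k\ne K}^{N} p'_k\sigma_k^2 + p\,\sigma_K^2 = \sigma_K^2$,
so that the
overall variance is equal to that of the K-th state and
independent of p. Then

$$ \lim_{p\to 1} \sum_{k=1}^{N} p_k \left(\frac{\sigma_n}{\sigma_k}\right)^4
 = \lim_{p\to 1} \left\{ \sum_{k=1,k\ne K}^{N} (1-p)\,p'_k \left(\frac{\sigma_K}{\sigma_k}\right)^4 + p \left(\frac{\sigma_K}{\sigma_K}\right)^4 \right\} = 1 $$

and

$$ \lim_{p\to 1} \left[ 2 \sum_{k=1}^{N} p_k \left(\frac{\sigma_k}{\sigma_n}\right)^4 - 1 \right]
 = \lim_{p\to 1} 2 \left\{ \sum_{k=1,k\ne K}^{N} (1-p)\,p'_k \left(\frac{\sigma_k}{\sigma_K}\right)^4 + p \left(\frac{\sigma_K}{\sigma_K}\right)^4 \right\} - 1 = 1 . $$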

Using the estimate obtained for the deflection of the
second-order detector, we can obtain an estimate of
the processing gain for the second-order detector as a
function of mixture model parameters. This estimate
is given by

$$ \left[ \left( \sum_{k=1}^{S} p_k^2 \frac{\sigma_n^4}{\sigma_k^4} \right)
 \left( 2 \sum_{k=1}^{S} p_k \frac{\sigma_k^4}{\sigma_n^4} - 1 \right) \right]^{1/2} . $$
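The second-order bound and estimate can be evaluated with a short Python sketch. The sketch makes several assumptions that are not stated explicitly in the text: the gain is the ratio of deflections, the bound and the estimate differ only in using $p_k$ versus $p_k^2$ in the first factor, the inputs are state variances $\sigma_k^2$, and decibels are computed as $10\log_{10}$ of the deflection ratio. The function names are hypothetical.

```python
import math


def _overall_variance(probs, variances):
    return sum(p * v for p, v in zip(probs, variances))


def _traditional_variance_factor(probs, variances):
    """2 * sum p_k (var_k / var_n)^2 - 1, the noise-only variance of |z|^2 / (2 var_n)."""
    var_n = _overall_variance(probs, variances)
    return 2.0 * sum(p * (v / var_n) ** 2 for p, v in zip(probs, variances)) - 1.0


def second_order_gain_bound(probs, variances):
    """Upper bound on the ratio of second-order to traditional deflection."""
    var_n = _overall_variance(probs, variances)
    first = sum(p * (var_n / v) ** 2 for p, v in zip(probs, variances))
    return math.sqrt(first * _traditional_variance_factor(probs, variances))


def second_order_gain_estimate(probs, variances):
    """Estimate of the same ratio with probabilistic (fuzzy) state assignment."""
    var_n = _overall_variance(probs, variances)
    first = sum(p * p * (var_n / v) ** 2 for p, v in zip(probs, variances))
    return math.sqrt(first * _traditional_variance_factor(probs, variances))


def to_db(deflection_ratio):
    return 10.0 * math.log10(deflection_ratio)


if __name__ == "__main__":
    for var_high in (4.0, 16.0):
        for p_low in (0.1, 0.3, 0.5):
            probs, variances = [p_low, 1.0 - p_low], [1.0, var_high]
            print(var_high, p_low,
                  to_db(second_order_gain_estimate(probs, variances)),
                  to_db(second_order_gain_bound(probs, variances)))
```

For a single-state model both quantities equal 1, so the sketch reproduces the lower bound of 1 in that degenerate case.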

Figure 11 shows the upper bound for processing gain
for the second-order detector for a two-state Gaussian
mixture model in decibels as a function of $p_l$ and
$\sigma_h^2/\sigma_l^2$ for $0 < p_l < 1$ and for
$1 < \sigma_h^2/\sigma_l^2 < 20$. For this range of
parameters the upper bound for processing gain is 12
dB for the second-order detector. Figure 12 shows
the processing gain estimates for the second-order
detector for the same ranges of parameter values as
used to obtain the results presented for the processing
gain upper bounds. Figure 13 presents contours of
the difference between the processing gain upper
bounds and the processing gain estimates. The
differences shown in figure 13 are estimates of the
losses due to assignment of samples to states
defining fuzzy sets.

Figures 14 and 15 present processing gain estimates
and processing gain bounds for a three-state model,
where $p_l = 0.25$, $p_m = 0.5$, $p_h = 0.25$, $\sigma_l = 1$, $\sigma_m$
varies from 2 to 8, and $\sigma_h$ varies from 8 to 44. This
example  was  chosen  to  illustrate  the  processing  gain 
for  a  class  of  three-state  models  shown  earlier  to  arise 
from  modal  interference  at  beamformer  outputs.  The 
figures  show  that  there  is  substantial  gain  in  the  region 
of  middle-state  and  high-state  variances  considered 
and  that  processing  gain  is  mainly  a  function  of  the 
high-state  variance.  This  is  because  the  deflection  for 
the  second-order  detector  is  almost  constant  for  the 
range  of  middle-state  and  high-state  variances 
considered  and  the  deflection  of  the  traditional 
detector  is  heavily  influenced  by  the  high-state 
variance.  The  difference  between  the  estimate  and 
the  upper  bound  for  the  processing  gain  varies  from 
2.8  to  3  dB. 
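The gap between bound and estimate for the three-state example can be checked with a few lines of Python. This sketch assumes that the gap is $10\log_{10}$ of the ratio of the second-order bound to the second-order estimate, which reduces to $5\log_{10}\left(\sum_k p_k\sigma_k^{-4} / \sum_k p_k^2\sigma_k^{-4}\right)$, and that the quoted $\sigma_l$, $\sigma_m$, $\sigma_h$ are state standard deviations. The grid spacing and function name are illustrative choices.

```python
import math


def second_order_gap_db(probs, sigmas):
    """Bound minus estimate in dB; sigmas are state standard deviations."""
    bound_sum = sum(p / s ** 4 for p, s in zip(probs, sigmas))
    estimate_sum = sum(p * p / s ** 4 for p, s in zip(probs, sigmas))
    return 10.0 * math.log10(math.sqrt(bound_sum / estimate_sum))


def three_state_gaps():
    """Gap over a grid of middle-state and high-state standard deviations."""
    probs = [0.25, 0.5, 0.25]
    return [
        second_order_gap_db(probs, [1.0, float(sigma_m), float(sigma_h)])
        for sigma_m in range(2, 9)        # 2, 3, ..., 8
        for sigma_h in range(8, 45, 4)    # 8, 12, ..., 44
    ]


if __name__ == "__main__":
    gaps = three_state_gaps()
    print(min(gaps), max(gaps))
```

Under these assumptions the gap is controlled almost entirely by the low state, because the middle and high states contribute little to either sum.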

Performance  Comparisons  Using  Deflection. 


During  the  theoretical  development  of  adaptive  locally 
optimum  processing  algorithms  suitable  for  processing 
interferer  Gaussian  mixture  models  for  signals  of 
unknown  structure,  we  have  obtained  upper  bounds 
and  estimates  of  processing  gain  and  identified 
algorithms  based  on  parametric  and  implicit  models  of 
the  interference.  In  this  subsection,  the  deflections 
obtained  through  simulations  of  the  detectors  are 
compared  to  the  analytical  results.  The  simulations 
address  the  performance  of  the  second-order 
detectors  at  the  sample  level.  In  the  next  subsection, 
we  address  second  order  line  detector  performance. 
We  concentrated  on  establishing  results  for 
second-order  detectors  for  Gaussian  mixture  models 
because  they  have  proven  applicability  to  undersea 
surveillance. 

The  signal  and  interferer  models  for  the  simulations 
are  baseband  models.  The  simulation  baseband  signal 
model is a 10 Hz sinusoid with unit amplitude:

$$ s = \cos 20\pi t + i \sin 20\pi t . $$

The  baseband  sample  rate  is  128  samples  per 
second. 

The expected values of the real and imaginary
components of the baseband signal for the second-order
detector are zero and their variances are
one-half for the majority of the simulations. This
signal level was chosen to illustrate the performance
of the second-order detector when the validity of the
Taylor expansion about zero signal, which is used to
derive the detector, begins to be questionable.
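The baseband signal model is easy to reproduce. The Python sketch below, an illustration rather than the original simulation code, generates the unit-amplitude 10 Hz complex sinusoid at 128 samples per second and checks that its real and imaginary components have zero mean and variance one-half over a 512-sample block; the constant and function names are assumptions.

```python
import cmath
import math

SAMPLE_RATE_HZ = 128.0   # baseband sample rate quoted in the text
TONE_HZ = 10.0           # signal frequency quoted in the text


def baseband_signal(n_samples, amplitude=1.0):
    """Samples of s(t) = amplitude * (cos 20*pi*t + i sin 20*pi*t) at t = n / 128."""
    return [
        amplitude * cmath.exp(2j * math.pi * TONE_HZ * n / SAMPLE_RATE_HZ)
        for n in range(n_samples)
    ]


def mean_and_variance(values):
    """Population mean and variance of a list of real numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


if __name__ == "__main__":
    signal = baseband_signal(512)
    print(mean_and_variance([z.real for z in signal]))
    print(mean_and_variance([z.imag for z in signal]))
```

A 512-sample block spans exactly 40 cycles of the tone, which is why the component means vanish and the variances equal one-half.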

For  each  simulation  run  the  statistics  are  assumed  to 
be  described  by  a  two-state  Gaussian  mixture  model. 
The low-state variance $\sigma_l^2 = 1$, while the other mixture
model parameters, $p_l$ and $\sigma_h^2$, are varied. A basic
unit  for  a  simulation  run  is  a  trial  consisting  of  512 
samples  of  the  interference.  For  each  trial,  the 
average  deflection  is  calculated  for  the  samples 
without  signal  and  the  same  samples  with  signal. 
Statistics  were  collected  for  a  minimum  of  15  trials  to 
allow  for  the  calculation  of  the  mean  and  standard 
deviation  of  the  average  deflection  for  a  simulation 
run. 
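A simulation run of this kind can be sketched in Python as follows. The sketch is a simplified stand-in, not the original simulation: it treats only the traditional detection quantity $T(z) = |z|^2/2\sigma_n^2$, draws independent samples from a two-state Gaussian mixture, and defines the per-trial deflection as the shift in the mean of $T$ caused by the signal divided by the noise-only standard deviation of $T$. That definition, the default parameters, the random seed, and all function names are assumptions.

```python
import cmath
import math
import random
import statistics


def mixture_noise(n_samples, p_low, var_low, var_high, rng):
    """Independent complex samples from a two-state Gaussian mixture."""
    samples = []
    for _ in range(n_samples):
        variance = var_low if rng.random() < p_low else var_high
        sd = math.sqrt(variance)  # per-component standard deviation
        samples.append(complex(rng.gauss(0.0, sd), rng.gauss(0.0, sd)))
    return samples


def trial_deflection(noise, signal, var_overall):
    """Deflection of T(z) = |z|^2 / (2 * var_overall) estimated from one trial."""
    t_noise = [abs(z) ** 2 / (2.0 * var_overall) for z in noise]
    t_both = [abs(z + s) ** 2 / (2.0 * var_overall) for z, s in zip(noise, signal)]
    shift = statistics.mean(t_both) - statistics.mean(t_noise)
    return shift / statistics.stdev(t_noise)


def run_simulation(p_low=0.5, var_low=1.0, var_high=16.0,
                   n_trials=15, n_samples=512, seed=1):
    """Mean and standard deviation of the per-trial deflection over n_trials."""
    rng = random.Random(seed)
    var_overall = p_low * var_low + (1.0 - p_low) * var_high
    # Unit-amplitude 10 Hz tone sampled at 128 samples per second.
    signal = [cmath.exp(2j * math.pi * 10.0 * n / 128.0) for n in range(n_samples)]
    deflections = [
        trial_deflection(mixture_noise(n_samples, p_low, var_low, var_high, rng),
                         signal, var_overall)
        for _ in range(n_trials)
    ]
    return statistics.mean(deflections), statistics.stdev(deflections)


if __name__ == "__main__":
    print(run_simulation())
```

Using the same noise samples with and without the signal, as the text describes, removes much of the trial-to-trial variation from the estimated shift in the mean.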

Simulations were conducted for the traditional detector
and three second-order detector cases:

(a)  Gaussian  mixture  model  with  exact 
parameters, 

(b)  Gaussian  mixture  model  with 
predetermined  errors  in  the  parameters,  and 

(c)  Gaussian  mixture  model  represented  by  an 
implicit  mixture  model. 

The average deflection values are compared with the
analytically  obtained  upper  bound  on  deflection  and 
the  analytically  obtained  estimate  of  deflection  as 
functions  of  Gaussian  mixture  model  parameters.  The 
relationship  between  single  sample  detection  and  false 
alarm  and  line  detection  is  discussed  in  the  next 
subsection;  probability  of  line  detection  versus 
probability  of  false  alarm  is  related  to  the  average 
deflection.  These  results  relate  average  deflection  as 
a  measure  of  performance  to  traditional  descriptions  of 
line  detector  performance  by  probability  of  detection 
versus  probability  of  false  alarm  for  a  given  input 
signal-to-noise  ratio. 

Deflection  Bounds  and  Deflection  for  Known 
Parameters. 


Earlier  we  obtained  an  upper  bound  for  the 
second-order  detector  deflection  as  a  function  of 
mixture  state  parameters.  This  upper  bound 
provides  the  value  of  deflection  for  a  clairvoyant 
detector,  that  is  for  a  detector  with  each  sample 
assigned  to  the  correct  state.  An  estimate  of  the 
deflection  as  a  function  of  mixture  state  parameters 
was also obtained. Here, we address the accuracy of
the estimate of deflection. Detailed results are
presented for two families of mixture models: (1)
$\sigma_l^2 = 1$, $\sigma_h^2 = 4$, and $p_l = 0.1$, 0.2, 0.3, 0.4, and 0.5
and (2) $\sigma_l^2 = 1$, $\sigma_h^2 = 16$, and $p_l = 0.1$, 0.2, 0.3, 0.4,
and 0.5. The mean and standard deviations of the
deflection values for 15 trials are presented in figures
16 and 18 with corresponding processing gains in
decibels presented in figures 17 and 19.

The  traditional  detector  curve  would  be  smooth  in 
figures  16  and  18  except  that  the  curve  is  obtained  by 
simulation.  The  upper  bounds  and  estimates  of 
deflection  are  calculated  from  the  formulas  derived 
earlier.  The  simulation  data  for  the  two-state 
Gaussian  mixture  model  second-order  detector  are 
shown  by  the  curves  which  are  not  labeled  in  figures 
16  and  18.  The  average  deflections  obtained  are 
presented  with  vertical  line  segments  connecting  the 
average  plus  and  minus  one  standard  deviation  of  the 
15  trials  run  to  obtain  the  average  value  of  deflection. 
Figures 17 and 19 present the processing gains in
decibels obtained as the difference in decibels of the
deflections for the second-order detector and the
traditional detector.

There is very little processing gain available when
$\sigma_l^2 = 1$ and $\sigma_h^2 = 4$, as indicated by figure 17. The
estimate seems to be a little optimistic, estimating a
processing gain from less than 0.5 dB to about 2 dB,
while the simulations indicate from 0 dB to about 1.25
dB. These simulations indicate the adaptive locally
optimum processing should not be used for a
two-state model with high-state to low-state variance
ratio below 4. The estimate is remarkably accurate
for the case $\sigma_l^2 = 1$ and $\sigma_h^2 = 16$, with the estimates
and simulation processing gains differing by less than
0.5 dB. For this case the processing gain varies from
about 3 to 8 dB.

Figures 20, 21, and 22 present the average deflection
values, processing gain, and fuzzy set losses,
respectively, for $\sigma_l^2 = 1$, $p_l = 0.1$, 0.2, 0.3, 0.4, and
0.5, and $\sigma_h^2 = 4$, 9, 16, 25, 36, 49, and 64. The
different  line  types  for  the  three  figures  correspond  to 
the  same  cases.  In  figure  21 ,  the  processing  gain  is 
an  increasing  function  of  high-state  variance  since  the 
low-state  variance  is  fixed.  Note  that  figure  20  exhibits 
deflection  as  a  function  of  mainly  low-state  probability 
and  low-state  variance.  This  is  because  the  adaptive 


locally  optimum  processing  algorithms  emphasize  the 
samples  estimated  to  be  in  the  low  states  over  those 
estimated to be in the high states. For the simulations
under discussion, the fuzzy set loss is also a function
of the low-state probability, decreasing in general from
about 4 dB for a low-state probability of 0.10 to about
2 dB for a low-state probability of 0.50. Figures 21
and  22  indicate  that  most  of  the  processing  gain  can 
be  achieved  by  modeling  the  interference  samples  as 
independent. 

Dependency  of  Processing  Gain  for  Gaussian  Mixture 
Models  on  Model  Parameter  Errors. 


Simulations were conducted to determine the
dependency of average deflection for the two-state
Gaussian mixture model on parameter errors. The true
two-state Gaussian mixture models were given by
$p_1 = 0.3$, $\sigma_1^2 = 1$, and $\sigma_2^2$ = 6, 10, and 16.
The error sensitivity analysis was performed by
supposing that the low-state probability and low-state
variance were in error, with the overall variance for the
model accurate. Thus the high-state variance is
determined from the supposed low-state probability
and low-state variance. The low-state probability was
varied from 0.1 to 0.8 in increments of 0.1 and the
low-state variance from 0.5 to 1.75 in increments of
0.25. Figures 23, 24, and 25 present processing gain
loss contours as a function of parameter error for the
three high-state variance cases. The contours were
generated using the contour plot routine of MATLAB.
The figures show that the processing gain achieved by
using a two-state Gaussian mixture model is not
sensitive to low-state variance and low-state
probability errors. The processing gain is nearly flat
near the true state. Indeed, in figure 23, the maximum
processing gain was not achieved for the model with
exact parameters; slightly more gain (0.1 dB) was
realized for a number of cases with higher low-state
probabilities than the true value, within the 0-dB loss
contour. This is possibly due to statistical variations in
the estimation of the processing gains, which are very
sensitive to the estimates of deflection for the
traditional detector. Generally speaking, there is
more sensitivity to the low-state variance than to the
low-state probability. In particular, it appears to be
better to overestimate the low-state variance than to
underestimate it.
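
The constraint that the overall variance stays accurate fixes the high-state variance once a low-state probability and low-state variance are supposed. The sketch below works this out over the parameter grid described above, assuming a zero-mean mixture so that the overall variance is the probability-weighted sum of the state variances. The function names are hypothetical, and the grid is built for the true model with high-state variance 16 only.

```python
def overall_variance(p_low, var_low, var_high):
    """Variance of a zero-mean two-state Gaussian mixture (assumed zero mean)."""
    return p_low * var_low + (1.0 - p_low) * var_high


def implied_high_variance(total_var, p_low, var_low):
    """High-state variance that keeps the overall variance equal to total_var."""
    var_high = (total_var - p_low * var_low) / (1.0 - p_low)
    if var_high <= 0.0:
        raise ValueError("supposed parameters are inconsistent with total_var")
    return var_high


# True model from the text: p1 = 0.3, low-state variance 1, high-state variance 16.
true_total = overall_variance(0.3, 1.0, 16.0)

# Supposed low-state probability 0.1..0.8 (step 0.1) and
# supposed low-state variance 0.5..1.75 (step 0.25).
grid = {}
for i in range(1, 9):
    p_low = i / 10
    for j in range(6):
        var_low = 0.5 + 0.25 * j
        grid[(p_low, var_low)] = implied_high_variance(true_total, p_low, var_low)
```

At the true parameters the grid returns the true high-state variance; overestimating the low-state probability pushes the implied high-state variance upward.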

The lack of sensitivity of the processing gain to
low-state probability suggests that modeling the noise
statistics on frequency bins adjacent to the bin containing
the signal, for an interferer occupying four or more
adjacent bins, should provide sufficient accuracy to
support effective adaptive locally optimum processing.
The modal propagation effects described in an earlier
section lead to periods of low-state variance. The low
variance values during these periods should be nearly
the same for adjacent frequency bins, while the
durations and frequency of occurrence of the periods
might be expected to vary more, which
would lead to the low-state probabilities varying from
frequency bin to frequency bin.

Dependency  of  Processing  Gain  for  Gaussian  Mixture 
Model  on  Implicit  Model  States. 


For the implicit models, the number of states is the
same as the number of samples used to construct the
model. The initial trials for implicit models
constructed by using 16, 32, and 64 samples were
conducted for $p_1$ = 0.1, 0.2, 0.3, 0.4, and 0.5, $\sigma_1^2 = 1$,
and $\sigma_2^2 = 16$. The simulation approach assures that
the samples used to model the interference are
independent samples. The appropriate way to
interpret the implicit model results is that the number
of samples corresponds to the number of independent
samples of the interference for an implementation of
the algorithm. For each trial, new samples were
selected to construct the implicit model used to obtain
results for the 512 samples drawn from the Gaussian
mixture model distribution. Figures 26, 28, and 30
summarize the deflection results, while figures 27, 29,
and 31 summarize the corresponding processing gain
results. It is clear from an examination of these
figures that there is some loss of performance from
using a model constructed from 16 samples, but very
little loss from using a model constructed from 32
samples in comparison with 64 samples. Thus an
implicit model can be built using about one-fourth the
number of samples needed to estimate model
parameters using the EM algorithm. The modeling
loss for 32 samples is less than 1 dB. Thus the
implicit modeling approach provides an alternative to
the EM algorithm. It is clear that the implicit model
can be used to model interference with rapidly
changing statistics.

Figure 32 presents average deflections with standard
deviations of the 15 trials used to obtain the averages
for implicit models using 32 samples for a low-state
variance of 1 and a high-state variance of 4. Figure 33
presents the processing gains corresponding to the
deflections presented in figure 32. Figure 33
indicates that implicit modeling should not be used
when the ratio of high-state to low-state variance is 4
or less.

Performance contours as a function of true two-state
Gaussian mixture model parameters were obtained for
the second-order detector using implicit
interference modeling with 32 samples. Figures 34, 35,
and 36 present the average deflection values,
processing gain, and modeling losses for 32 samples,
for $\sigma_1^2 = 1$; $p_1$ = 0.1, 0.2, 0.3, 0.4, and 0.5; and $\sigma_2^2$ =
4, 9, 16, 25, 36, 49, and 64. The different line types
for the three figures correspond to the same cases.
As a result, the different cases can be inferred by
examining figure 35 from the fact that performance
increases with increasing high-state variance.

The implicit model deflection curves shown in figure 34
are closely clustered, indicating that the performance
depends mainly on the low-state parameters. The
results presented in figure 34 suggest that the processing
gain penalty shown in figure 35 for implicit modeling by
32 samples results from the fact that the model
membership function is a Bessel function rather than a
Gaussian distribution. Observe from figure 35 that
use of the implicit model leads to gains except for the
case of high-state variance 4 already discussed.

Figure 36 indicates that, provided the probability of the
low state is at least 0.2 (and by inference, but not
demonstrated, less than 0.8), the implicit modeling loss
is between 1 and 2 dB. We tried constructing implicit
models by including repeats of the state with the
threshold variance (the 10th percentile of the sample
norms) for each sample whose norm is less than or
equal to that threshold. For every case, this modeling
approach performed at least 1 dB worse than the case
for which results have been presented.

Implementation  Issues  and  Summary. 


Adaptive  locally  optimum  second-order  detectors  are  a 
natural  extension  of  the  noise  equalization  algorithm 
traditionally  used  to  process  beamformed  ocean 
surveillance  data.  Our  analysis  suggests  that  adaptive 
locally  optimum  processing  can  significantly  improve 
the  detection  of  signals  masked  by  ship-generated 
interfering  lines. 

The  main  unresolved  issue  for  application  of  locally 
optimum  processing  to  ocean  basin  surveillance  is  the 
nature  of  the  received  narrowband  signal  and 
interference  signals.  Simulations  and  hydrophone 
MDA  data  indicate  that  the  received  narrowband 
signals  could  be  better  described  by  a  multistate 
Gaussian  mixture  model  than  a  one-state  Gaussian 
mixture  model.  The  scatter  plots  of  the  hydrophone 
MDA  data  indicate  that  the  interference  is  better 
modeled  by  a  Gaussian  mixture  model  than  by  a 
noncentral  mixture  model.  Likewise  it  may  be 
assumed  that  the  signal  is  normally  of  unknown 
structure.  Given  the  present  state  of  knowledge,  only 
the  second-order  detectors  for  Gaussian  mixture 
models  for  amplitudes  have  established  relevance  to 
ocean  basin  surveillance.  However,  conceptual 
considerations  indicate  that  first-order  and  second- 
order  detectors  for  noncentral  mixture  models  of 
amplitudes  and  symmetric  phase  differences  and 
first-order  detectors  for  Gaussian  mixture  models  for 
amplitudes  should  be  used  to  complement  the 
second-order  Gaussian  mixture  model  amplitude 
processing. 

Additional  processing  of  beamformed  data  containing 
real  signals  and  interference  should  be  carried  out  to 
determine  whether  the  best  general  approach  to 
implementing  first-order  and  second-order  detectors 
for  Gaussian  mixture  models  should  use  the  model 
approach  or  real-time  parametric  or  nonparametric 
estimates  of  the  interference.  The  robustness  of  the 
two-state  detector  indicated  that  most  of  the 
processing  gain  for  a  second-order  detector  could  be 
obtained  for  approximate  models.  This  suggests  that 
a  reasonable  number  of  models  might  suffice  as 
predetermined  candidate  models  for  the  received 
signal  plus  interference  complex  sample  amplitudes. 
Therefore,  the  modeling  approach  provides  another 
implementation  option. 

The  results  obtained  on  the  characterization  of 
interference  statistics  indicated  more  complexity  of  the 
received  signal  statistics  than  captured  by  a  two-state 
or  three-state  Gaussian  mixture  model.  These  results 
leave  open  the  possibility  that  the  nonparametric 
approach,  which  does  not  require  a  priori  knowledge  of 
the  number  of  the  states  of  the  model,  may  provide 
better  performance  than  the  parametric  approach.  At 
the  present  time,  our  results  suggest  that  either 
approach  provides  considerable  processing  gain  over 
traditional  processing  when  narrowband  interference 
masks  a  signal  of  interest. 


The overall scheme for the selection of samples and
the processing of the beamformer output data needs
to be decided before implementing the adaptive locally
optimum processing. In particular, the detectors
could be implemented using block recursive schemes,
with or without overlap processing, or sample recursive
schemes. The samples used to model the interference
for the adaptive locally optimum processing of a
given sample should probably be drawn from adjacent
frequency and temporal bins in such a way that the
desired signal can occupy two adjacent bins. The
total number of samples needed to model the
interference when the EM algorithm is used to extract
model parameters should be around 100 samples,
while 32 samples suffice to model the interference
implicitly.

The  detectors  in  this  subsection  were  characterized 
for  independent  samples.  Averaging  these  detectors 
leads  to  line  detectors,  whose  performance  is  briefly 
addressed in the next subsection. Observe that the
line detector for a first-order detector involves
coherent integration, while that for the second-order
detector involves noncoherent integration. This is
another reason why the application of first-order
detectors to ocean basin surveillance should be
pursued further.

Line  Detection  and  Classification. 


Introduction. 


In  the  last  section,  we  established  that  nonlinear 
processing  of  beamformed  data  could  be  viewed  as  a 
generalization  of  the  NSE  algorithm  to  improve 
detection  of  signals  masked  by  narrowband 
interference  described  by  multistate  Gaussian  mixture 
models.  The  nonlinear  processing  treats  the 
frequency  domain  beamformer  output  samples  as 
independent.  In  this  subsection,  we  briefly  discuss 
spatial  cell  combining  techniques  suitable  for  the 
detection  of  narrowband  signals  (lines).  A  theory  of 
such  techniques  tailored  to  adaptive  locally  optimum 
processed  matched  field  beamformer  outputs  has  yet 
to  be  developed.  At  the  present  time,  the  noise 
statistics  of  the  background  after  locally  optimum 
processing  based  on  Gaussian  mixture  models  have 
not  been  characterized  for  either  first  or  second-order 
detectors.  It  is  probably  reasonable  to  treat  this 
background  noise  as  Gaussian  so  that  the  extensive 
work  of  Bar-Shalom  (1988,  1990)  provides  a  starting 
point  for  such  an  investigation. 




Line  detection  and  classification  for  undersea 
surveillance  systems  is  often  performed  by  eye 
integration  of  NSE  processed  data  presented  on 
LOFARGRAMs  as  described  in  the  introduction  and 
background  subsections.  The  operators  are  alerted 
to  lines  of  interest  by  periodic  printouts  listing  lines  of 
interest  by  frequency  and  bearing.  We  suggest  a 
generalization  of  this  approach  for  high  gain  arrays  to 
classify  lines  automatically  detected  by  line  tracking 
techniques  such  as  adaptive  line  enhancement. 

A classical LOFARGRAM is a three-variable
presentation: Fourier coefficient magnitude (intensity
of a pixel), frequency (abscissa), and time (ordinate).
The fourth variable, bearing, is indicated by the beam
whose output is displayed on the
LOFARGRAM. A matched field beamformer output is
a function of these variables as well as range from the
array and depth of the source. In addition, due to
increased resolution in bearing, it is no longer
reasonable to base detection on a display of data
from a single beam. Color displays with chrominance
and intensity allow the presentation of one more
variable than a monochromatic LOFARGRAM. The
challenge is to reduce the number of displayed
variables to allow an operator to confirm automatic
detections and classify detected lines.

One  approach  is  to  use  adaptive  line  enhancement 
techniques  to  automatically  detect  potential  lines  and 
then  to  structure  the  database  to  allow  display  of  the 
tracked  line  in  a  color  LOFARGRAM-like  display  with 
depth  suppressed  time  intensity  for  a  given  frequency 
(magnitude  of  Fourier  coefficient),  chrominance 
(depth),  bearing  (abscissa),  and  range  (ordinate). 
Chrominance  would  be  used  to  combine  information 
for  a  certain  number  of  depth  bins  around  the  average 
depth  of  the  line  being  displayed.  Mouse  selection  of 
a  sequence  of  adjacent  bins  could  be  used  to  activate 
temporal  histories  of  any  of  the  displayed  parameters 
of  the  line. 

Display technology is rapidly advancing, so more
sophisticated displays than those described in the
previous paragraph may soon be available to allow the
operator to assess the characteristics of a line
detected by adaptive line enhancement processing.
To pursue this subject further,
we have concentrated our attention on processing that
might be used to automatically detect lines. We begin
this discussion by describing the relationship between
the detectors discussed in the last section and line
detectors. These results provide an upper bound on
the line detection performance that can be achieved
using interference tracking algorithms.

Upper  Bounds  for  Line  Detectors. 


Upper bounds for line detectors are obtained by
averaging deflection. In a more general context, the
results characterize performance when the signal is
tracked perfectly. Given a detector $D : \mathbb{R}^2 \to \mathbb{R}$,
recall that the line detector for a sequence of complex
samples $z_1 = x_1 + i y_1, \ldots, z_N = x_N + i y_N$ is

$$\frac{1}{N} \sum_{k=1}^{N} D(x_k, y_k).$$

For a given sequence of samples
with signal and an equal length sequence of samples
without signal, and a threshold $T$, a probability of
detection and probability of false alarm are defined for
both the detector and the line detector in a natural
way. Let $M \gg N$ be the number of samples in each
sequence. Let $M_s$ be the number of samples with
signal for which $D(x, y) > T$ and let $F_s$ be the number
of samples without signal for which $D(x, y) > T$. Let
$M_{ss}$ be the number of subsequences for which the line
detector is calculated. Let $M_l$ be the number of these
subsequences of samples with signal for which

$$\frac{1}{N} \sum_{k=1}^{N} D(x_{j+k}, y_{j+k}) > T$$

and let $F_l$ be the number of these subsequences of
samples without signal for which the same inequality
holds. Then the estimates of sample probability of
detection and sample probability of false alarm are

$$PS_d = \frac{M_s}{M} \quad \text{and} \quad PS_{fa} = \frac{F_s}{M},$$

respectively, and the estimates of line probability
of detection and line probability of false alarm are

$$PL_d = \frac{M_l}{M_{ss}} \quad \text{and} \quad PL_{fa} = \frac{F_l}{M_{ss}},$$

respectively.
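
The counting estimates just defined can be sketched in a few lines. This is an illustration under stated assumptions rather than the report's implementation: the energy detector stands in for D, subsequences are taken as every run of N consecutive samples (the text does not say whether subsequences overlap), and all function names are hypothetical.

```python
def energy(x, y):
    """Energy detector on one complex sample x + iy."""
    return x * x + y * y


def sample_rates(with_signal, noise_only, threshold, detector=energy):
    """Per-sample estimates (detection, false alarm) by threshold counting."""
    m = len(with_signal)  # both sequences are assumed to have equal length
    detections = sum(1 for x, y in with_signal if detector(x, y) > threshold)
    false_alarms = sum(1 for x, y in noise_only if detector(x, y) > threshold)
    return detections / m, false_alarms / m


def line_statistics(samples, n, detector=energy):
    """Average of the detector over each run of n consecutive samples."""
    values = [detector(x, y) for x, y in samples]
    return [sum(values[j:j + n]) / n for j in range(len(values) - n + 1)]


def line_rates(with_signal, noise_only, n, threshold, detector=energy):
    """Line-detector estimates (detection, false alarm) over subsequences."""
    sig = line_statistics(with_signal, n, detector)
    noise = line_statistics(noise_only, n, detector)
    count = len(sig)  # number of subsequences, same for both sequences
    return (sum(v > threshold for v in sig) / count,
            sum(v > threshold for v in noise) / count)
```

For example, with `with_signal = [(2, 0), (0, 0), (2, 0), (0, 0)]`, `noise_only = [(1, 0), (0, 0), (1, 0), (0, 0)]`, and a threshold of 1.5, the per-sample detection estimate is 0.5 while averaging pairs of samples raises the line detection estimate to 1.0, with no false alarms in either case.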

Figures  37  and  38  are  scatter  plots  of  the  probability  of 
sample  detection  for  a  probability  of  false  alarm  of 
10%  against  the  deflection  for  the  traditional  detector 
and  for  the  two-state  Gaussian  mixture  model 
second-order  detector  for  low-state  variance  1  and 
high-state  variance  6  and  16.  The  data  presented  is 
the  same  data  used  to  generate  the  contour  plots 
presented  in  figures  23  and  24  of  the  previous  section. 
Figures 39, 40, 41, and 42 are scatter plots of the
probability of line detection for a false alarm probability of 10%
against the deflection for the traditional line detector
and for the two-state Gaussian mixture model
second-order line detector for low-state variance 1 and
high-state variance 6, 10, 16, and pooled for these
cases, respectively. The line detection statistics are
obtained from the sample statistics by averaging 128
samples. In each plot, the traditional detector results
are presented by asterisks and the adaptive locally
optimum second-order detector results by crosses, except for
figure 42, where "+" is high-state variance 6 data, "x" is
high-state  variance  10  data,  and  "o"  is  high-state 
variance  16  data.  For  every  case  the  low-state 
variance  is  1  and  the  signal  variance  0.5  and  data  are 
presented  for  all  the  different  model  parameter  choices 
to  model  the  underlying  two-state  Gaussian  mixture 
model. 

Some  correlation  between  probability  of  detection  and 
deflection  is  shown  in  figure  37.  For  this  case,  the 
adaptive  locally  optimum  processing  sometimes 
performed  worse  than  the  traditional  detector  and 
sometimes about 1 to 2 dB better. Figure 38 shows
that by the time the high-state to low-state
variance ratio has reached 16 to 1, the probability of
detection  and  deflection  have  become  more  highly 
correlated  and  the  data  points  for  the  adaptive  locally 
optimum  processing  are  separated  from  a  cluster  of 
data  points  for  the  traditional  detector. 

Figures 39 through 42 show how highly correlated
probability of line detection and deflection are for the
data sets under discussion; the correlation is
independent of parameter error. These figures justify
using deflection as a criterion for the development of
the detectors. They also clearly indicate the
performance improvements derivable from use of
adaptive locally optimum processing even in the
presence of large modeling errors.

A series of simulations was conducted to generate
receiver operating characteristic (ROC) curves and
soft decision grams for two-state Gaussian mixture
model second-order detectors (Stein, Bond, and
Zeidler, 1993). The simulations were conducted to
establish  the  achievable  performance  gains  using 
adaptive  locally  optimum  processing  techniques. 

The simulations investigated the performance of the
algorithms using a parametric description of the
interference. The parameters were obtained using the
EM technique. (See appendix B to "Gaussian Mixture
Models for Acoustic Interference" for a description of how
the EM algorithm was used to provide parameter
estimates.) The interference was assumed to be
described by independent samples from a stationary
Gaussian mixture model. In the context of the last
subsection, if the use of the EM algorithm to estimate
noise-only statistics from frequency bins adjacent to
the bin containing the signal yields a model whose
parameters are approximately correct, the
performance obtained should be within 1 to 2 dB of
that estimated by the simulations. The simulations
also address cases in which the signal energy is
comparable to or exceeds the low-state variance. For
these cases, the derivation of the algorithms as
implementations of optimal detectors no longer applies.
However, the algorithms can be quite effective if
modified to compensate for the presence of the signal
in the processed samples.

The ROC curves were generated in the following way.
A two-state Gaussian mixture model was chosen.
Each run consisted of 10,240 independent trials. For
each independent trial, 130 independent samples were
generated for the assumed two-state Gaussian
mixture model, and the model parameters were estimated
using the Estimate and Maximize (EM) algorithm to
obtain a detector for the samples for signal plus
noise. Another 130 independent samples were then
generated, and the model parameters were estimated using
the EM algorithm to obtain a detector for the samples
for noise alone.
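
The sample-generation step of each trial can be sketched as follows. This is a minimal illustration, assuming zero-mean circular complex Gaussian states with each state's variance split equally between the real and imaginary parts; that convention and the name `draw_mixture_samples` are assumptions, and the EM fitting step is not shown.

```python
import random


def draw_mixture_samples(n, p_low, var_low, var_high, seed=None):
    """Draw n independent complex samples from a two-state Gaussian mixture.

    Each sample is in the low state with probability p_low; the state
    variance is split equally between real and imaginary parts (assumed).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        variance = var_low if rng.random() < p_low else var_high
        sd = (variance / 2.0) ** 0.5
        samples.append(complex(rng.gauss(0.0, sd), rng.gauss(0.0, sd)))
    return samples


# One trial draws two independent batches of 130 noise samples; a signal
# would be added to the first batch before detection.
batch_for_signal = draw_mixture_samples(130, 0.5, 1.0, 10.0, seed=1)
batch_noise_only = draw_mixture_samples(130, 0.5, 1.0, 10.0, seed=2)
```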

Probability  of  detection  results  for  different  thresholds 
were  obtained  for  four  detectors: 

(1)  the  traditional  detector, 

(2)  a  detector  with  the  processing  based  on 
noise-only  statistics  (the  detector  obtained  in  the  limit 
as  the  signal  goes  to  zero), 

(3)  a  detector  obtained  by  adjusting  state 
membership  functions  for  the  presence  of  signal,  and 

(4)  a  detector  obtained  by  adjusting  state 
membership  functions  by  locally  scaling  the  estimated 
noise  variance. 

The  traditional  detector  is  an  energy  detector.  The 
detector  with  processing  based  on  noise  only  assigns 
the  individual  signal-plus-interference  samples  to  the 
low  and  high  states  of  the  model  based  on  the  norm  of 
the  interference.  This  detector  cannot  be  achieved  in 
practice.  It  is  included  to  provide  an  upper  bound  for 
an  adaptive  locally  optimum  second-order  detector 
when  the  interference  is  described  by  a  two-state 
Gaussian  mixture  model  and  the  samples  have 
independent  interferer  components.  The  third 
detector  includes  an  adjustment  for  the  presence  of 
signal.  The  variance  of  the  signal  is  estimated  from 
the  difference  of  the  variance  of  the  samples 
processed  (with  signal  added)  from  the  variance  of  the 
noise,  which  is  known  by  assumption.  The  variance 
of  the  signal  can  then  be  subtracted  from  the  norms  of 
the  samples  to  adjust  the  low-state  and  high-state 
membership  functions.  The  fourth  detector  was 
obtained  by  scaling  the  norms  for  the  samples  with 
signal  so  that  their  norms  provide  the  same  variance 
as  that  for  the  noise  only.  Both  of  these  detectors 
would  reduce  to  the  adaptive  locally  optimum  detector 
when  the  signal  energy  is  small  compared  with  the 
low-state  variance. 
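
The two signal-compensation adjustments described above can be sketched as operations on the sample norms. The sketch assumes that "norm" means squared magnitude, that the samples are zero mean so that variance equals mean power, and that negative values after subtraction are clipped to zero; the clipping and all function names are assumptions, not details given in the text.

```python
def mean_power(samples):
    """Mean squared magnitude, used as the variance of zero-mean samples."""
    return sum(abs(z) ** 2 for z in samples) / len(samples)


def estimate_signal_variance(samples, noise_var):
    """Signal variance as the excess of observed variance over noise variance."""
    return max(mean_power(samples) - noise_var, 0.0)


def norms_adjusted_by_subtraction(samples, noise_var):
    """Third-detector style: subtract the estimated signal variance."""
    signal_var = estimate_signal_variance(samples, noise_var)
    # Clipping at zero is an assumption; the text does not say how
    # negative differences are handled.
    return [max(abs(z) ** 2 - signal_var, 0.0) for z in samples]


def norms_adjusted_by_scaling(samples, noise_var):
    """Fourth-detector style: rescale so the mean norm equals noise_var."""
    scale = noise_var / mean_power(samples)
    return [scale * abs(z) ** 2 for z in samples]
```

In both cases the adjusted norms, rather than the raw ones, would feed the low-state and high-state membership functions. When the estimated signal variance is near zero, both adjustments leave the norms essentially unchanged.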

Probability  of  detection  results  were  obtained  for  each 
of  the  four  detectors  for  a  signal  that  was  present  in 
one  frequency  bin  for  the  first  65  samples  and  then 
present  in  the  adjacent  bins  for  the  next  65  samples. 
This  feature  was  incorporated  in  the  simulations  to 
provide  some  indication  of  the  robustness  of  the 
techniques  when  the  duration  of  the  signal  in  any  one 
frequency bin is unknown. Detectors 3 and 4
were implemented based on the estimates obtained
using all 130 samples in each of these bins.

Probability  of  false  alarm  estimates  were  obtained  for 
each  of  the  thresholds  used  to  generate  probability  of 
detection  using  the  noise  only  samples. 
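
A ROC curve built this way is a sweep over thresholds, pairing the fraction of signal-plus-noise detector outputs above each threshold with the fraction of noise-only outputs above it. The sketch below shows only that sweep; the name `roc_points` and the input lists of detector outputs are hypothetical.

```python
def roc_points(outputs_with_signal, outputs_noise_only, thresholds):
    """Return (false alarm, detection) probability estimates per threshold."""
    points = []
    for t in thresholds:
        p_detect = sum(v > t for v in outputs_with_signal) / len(outputs_with_signal)
        p_false = sum(v > t for v in outputs_noise_only) / len(outputs_noise_only)
        points.append((p_false, p_detect))
    return points
```

Using the same threshold list for both sets of outputs is what ties each detection estimate to its false alarm estimate.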

ROC curves were generated for various two-state
Gaussian mixture models. Figure 43 presents the
curves for $p_1 = 0.5$, a high-state to low-state variance
ratio $\sigma_2^2/\sigma_1^2 = 10$, and a signal to low-state variance
ratio $\sigma_s^2/\sigma_1^2$ of (a) 2, (b) 1, (c) 0.5, and (d) 0.25,
respectively; figure 44 presents the curves for
$p_1 = 0.25$, $\sigma_2^2/\sigma_1^2 = 10$, and $\sigma_s^2/\sigma_1^2$ of (a) 2, (b) 1,
(c) 0.5, and (d) 0.25, respectively;
figure 45 presents the curves for $p_1 = 0.1$,
$\sigma_2^2/\sigma_1^2 = 4.2$, and $\sigma_s^2/\sigma_1^2$ of (a) 7.3, (b) 3.6,
(c) 1.8, and (d) 0.9, respectively. The processing gain
relative to the traditional detector is independent of the
signal energy. Modest processing gains of 4.2, 2.5,
and 1 dB are predicted by the estimated ratio of
deflections obtained in the last subsection for the
cases presented in figures 43, 44, and 45, respectively.
Even modest processing gains lead to clearly
discernible improvement of probability of detection for
the adaptive locally optimum detectors over the
traditional detector for a given probability of false
alarm, as illustrated by figures 43, 44, and 45.

For each case, the processing gain increases as the
ratio of signal level to low-state variance decreases, that is, as
the processing becomes more optimal. The figures
indicate  the  inherent  robustness  of  the  second-order 
detector.  It  still  provides  significant  processing  gain 
under  signal  to  interference  conditions  for  which  the 
derivation  of  the  adaptive  locally  optimum  detector 
breaks  down.  This  is  not  always  a  feature  of  adaptive 
locally  optimum  processing  techniques.  In  particular, 
for  applications  when  coherent  detection  of  the 
reconstructed  signal  is  required,  it  is  definitely  not  the 
case;  signal  distortion  becomes  manifest  for  a  ratio  of 
interference  to  signal  of  2  to  1  (Bond  and  Hui,  to 
appear). 

Target  Tracking. 


It is necessary to combine energies in successive
temporal beamformer spatial cells to detect signals
from a target whose depth, range
from the receiving array, and frequency may be
changing from one cell to an adjacent cell. The
tracking of signals exploits the continuity of the signal
term in the beamformer output. In the last subsection,
techniques were described to exploit interference, and
the techniques developed ignored temporal correlation
of the interference. In general, the correlation among
the signal components of the samples cannot be
ignored. Ignoring correlation of the interference among
samples described by a Gaussian mixture model
causes a small proportional performance loss. In
contrast, the correlation of signal in samples usually
needs to be exploited for detection and is often
necessary for classification.

There  are  two  general  approaches  to  combining 
energy.  One  approach  is  to  use  data  base  browsing 
techniques  to  identify  potential  detections  and  through 
operator  interaction  provide  the  capability  to  the 
operator  to  obtain  an  estimate  of  the  probability  that 
the  energies  in  the  operator-designated  track  would 
have  occurred  due  to  noise  alone.  A  technique 
similar  to  this  has  proved  effective  for  associating 
correlation  peaks  over  time  in  interarray  processing. 
The  other  general  approach  is  to  process  the 
beamformed  data  with  automatic  detection  and 
tracking  algorithms  and  use  the  algorithms  to  alert  the 
operator  to  cases  of  interest.  In  this  approach,  the 
operator  then  examines  the  data  using  various  display 
options  to  confirm  its  interest. 

The development of optimal tracking algorithms for
Gaussian mixture model first- or second-order
detectors remains to be done. At the present time, the
properties of the noise at the outputs of the detectors
have not been characterized. Tracking techniques
using Markov models of the transitional probabilities
relating the next spatial cell containing the signal to a
finite number of previous spatial cells containing the
signal could be investigated.

A preliminary analysis of the tracking problem has
been conducted. The results are presented in
appendix C. In this analysis, likelihood ratios are
assigned to tracks made up of segments over which
samples can be coherently combined, and
movement from one segment to the next is modeled using
random walks constrained to result in a net movement
in some direction. The movement in the chosen
direction is analyzed through use of a one-dimensional
model. The one-dimensional model results indicate
that positive cell signal-to-noise ratios are needed for
the segments before tracking provides gain over
single-segment detection. In a general context, the
algorithms moderately improve detection over that
provided by the individual segments and allow for the
tracking of targets from segment to segment, which
aids in the classification of detected targets.

Summary. 

A  variety  of  traditional  and  recently  developed 
information  processing  techniques  have  applicability  to 
processing  the  beamformed  output  of  large  arrays  of 
hydrophones.  No  one  technique  is  the  best  for  all  of 
the scenarios that may be of interest. The signal of
interest may be narrowband, broadband, or broadband
with associated narrowband signals; the interference
can be narrowband or broadband.

Adaptive  locally  optimum  processing  can  be  used  to 
improve  detection  of  signals  masked  by  interference. 
The  adaptive  locally  optimum  processing  should  be 
used  between  a  time  domain  beamformer  and  spectral 
analysis.  This  allows  adaptive  locally  optimum 
processing  techniques  to  be  used  to  cancel 
narrowband  interferers  with  slowly  varying  frequency 
and  amplitude,  whenever  they  exceed  the  background 
by 6 dB or more. In addition, the Gaussian mixture
model first-order detector can be used to handle
nonstationary interference. These techniques can be
used whether the desired signal is broadband or
narrowband. Also, the search for narrowband lines,
with or without associated broadband features, can be
based on use of traditional spectral analysis or
recently developed time-frequency domain analysis
techniques.

The  adaptive  locally  optimum  processing  can  also  be 
used  after  a  frequency  domain  beamformer.  A 
general  theory  of  these  techniques  has  been 
presented.  In  particular,  we  have  shown  the 
Gaussian  mixture  model  second-order  detectors  to  be 
applicable  to  signals  received  with  constructive  and 
destructive  propagation  mode  interactions. 

The processing of beamformer outputs by different
detection algorithms provides different information
about the signals present, narrowband and broadband.
In general, it may be helpful to give an informed
operator the capability to display the outputs of the
various detection algorithms side by side, and possibly
superimposed, using different colors for the displays
being overlaid.

The  output  of  a  matched  field  beamformer  consists  of 
outputs  for  three  spatial  dimensions,  a  time  dimension, 
and  a  frequency  dimension.  For  high-spatial  and/or 
high-frequency  resolution  beamformer  outputs,  it  will 
usually  be  necessary  to  combine  energy  from  different 
spatial  cells  over  time  to  allow  for  detection  of  weak 
signals.  The  algorithms  to  best  accomplish  this  need 
to  be  developed.  The  traditional  technique  of  eye 
integration  may  still  have  a  central  role  in  the 
multidimensional  case.  The  operator  could  be 
provided  with  database  browse  software,  which  would 
allow  for  the  display  of  beamformed  output  for  any 
surface  defined  in  combined  spatial,  temporal,  and 
frequency  space. 

Automated classification algorithms could process
the outputs of the beamformer after detection
processing to identify signals of interest. The
operator could then examine the signals identified as
being of potential interest using the browse feature.

The  information  processing  analysis  has  revealed  a 
central  role  for  adaptive  locally  optimum  processing  in 
ocean  basin  surveillance.  The  techniques  can 
substantially  improve  the  detection  of  weak  signals 
masked  by  other  signals. 


Figure 3. Complex sample scatter plots for mixture models. (Panels: noise; noise + CW. Amplitude of CW = 10; noise: zero mean, unit variance for both components.)

Figure 4. Fuzzy set interpretation of mixture model construction process. (Model construction: each sample $z_j$ is a member of one and only one state. Model use: fuzzy set membership; sample $z_j$ is in state $k$ with probability $w(j,k)$.)

Figure 5. Normalized Bessel function and the Gaussian distribution.

Figure 8. Processing gain upper bound contours for a two-state Gaussian mixture model first-order detector.

Figure 9. Processing gain contours for a two-state Gaussian mixture model first-order detector.

Figure 12. Processing gain contours for a two-state Gaussian mixture model second-order detector.

Figure 13. Processing gain losses due to fuzzy set membership for a two-state Gaussian mixture model second-order detector.

Figure 16. Deflection comparison for two-state Gaussian mixture model second-order detector for $\sigma_H^2 = 4$.

Figure 17. Processing gain comparison for two-state Gaussian mixture model second-order detector for $\sigma_H^2 = 4$.

Figure 21. Processing gain for the two-state Gaussian mixture model second-order detector.

Figure 23. Parameter sensitivity for a two-state Gaussian mixture model. (Axes: low-state variance estimate; fuzzy set loss.)

Figure 27. Implicit model processing gain for a two-state Gaussian mixture model second-order detector for 16 samples and high-state variance 16.

Figure 28. Implicit model deflection values for a two-state Gaussian mixture model second-order detector for 32 samples and high-state variance 16.

Figure 29. Implicit model processing gain for a two-state Gaussian mixture model second-order detector for 32 samples and high-state variance 16.

Figure 30. Implicit model deflection values for a two-state Gaussian mixture model second-order detector for 64 samples and high-state variance 16.

Figure 31. Implicit model processing gain for a two-state Gaussian mixture model second-order detector for 64 samples and high-state variance 16.

Figure 32. Implicit model deflection values for a two-state Gaussian mixture model second-order detector for 32 samples and high-state variance 4.

Figure 33. Implicit model processing gain for a two-state Gaussian mixture model second-order detector for 32 samples and high-state variance 4.

Figure 34. Implicit model deflections for a two-state Gaussian mixture model second-order detector.

Figure 35. Implicit model processing gain for a two-state Gaussian mixture model second-order detector.

Figure 36. Implicit modeling loss for a two-state Gaussian mixture model second-order detector.

Figure 39. Probability of detection versus deflection for the two-state Gaussian mixture model second-order line detector for $\sigma_H^2 = 6$.

Figure 40. Probability of detection versus deflection for the two-state Gaussian mixture model second-order line detector for $\sigma_H^2 = 10$.

Figure 42. Probability of detection versus deflection for the two-state Gaussian mixture model second-order line detector for $\sigma_H^2 = 6$, 10, and 16.

Figure 44. Receiver operating curves for a two-state Gaussian mixture model second-order line detector for P = 0.25 and
(a) Signal variance to low-state variance 2
(c) Signal variance to low-state variance 0.5
(d) Signal variance to low-state variance 0.25

APPENDIX A

Global  Processing  Gain  Bounds  for  Multistate  Gaussian  Mixture  Models 

In this appendix, we obtain global bounds for the processing gain for multistate Gaussian mixture model
first-order and second-order detectors. The bounds are obtained by relating the processing gain of any three or
more state Gaussian mixture model to the processing gain of a constructed Gaussian mixture model with fewer
states.


We would expect the processing gain achievable for a multistate model for the first-order detector to depend on
the low-state and high-state variances and the overall noise variance. We would like to obtain an upper bound for
the processing gain subject to $\sigma_1^2 = \sigma_L^2$, $\sigma_N^2 = \sigma_H^2$, and $\sum_{k=1}^{N} p_k\sigma_k^2 = \sigma_n^2$. Moving energy from a middle state to
outer states should make the distribution less Gaussian-like and increase processing gain, while moving energy
from the outer states to a middle state should decrease processing gain. This suggests that the maximal
processing gain is achieved when the interference is modeled by a two-state model with all the energy in the
lowest and highest states, for which processing gain is determined by the probability of the lower state and the
ratio of the variances of the states, that is, the dynamic range of the interference and the percentage of time with
little interference. This, in turn, provides a model-free criterion for when it is worthwhile to use adaptive locally
optimum processing first-order detectors rather than traditional coherent detection processing. It is rather easy to
make this idea precise, as we now proceed to do.
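The effect of moving energy to the outer states can be checked numerically. The sketch below is illustrative only: it assumes the first-order bound has the form $\sigma_n^2\sum_k p_k/\sigma_k^2$ (the form of the sums used in this appendix and in appendix B), and the three-state probabilities and variances are hypothetical numbers chosen for the example, not values from the report.

```python
def first_order_gain_bound(probs, variances):
    """Return sigma_n^2 * sum_k p_k / sigma_k^2 for a Gaussian mixture.

    Assumed form of the first-order processing gain upper bound.
    """
    noise_var = sum(p * v for p, v in zip(probs, variances))
    return noise_var * sum(p / v for p, v in zip(probs, variances))


def two_state_equivalent(low_var, high_var, noise_var):
    """Two-state model with the same outer variances and overall variance."""
    p_low = (high_var - noise_var) / (high_var - low_var)
    return [p_low, 1.0 - p_low], [low_var, high_var]


# Hypothetical three-state interference model (illustrative numbers only).
probs3, vars3 = [0.5, 0.3, 0.2], [1.0, 4.0, 16.0]
noise_var = sum(p * v for p, v in zip(probs3, vars3))

# Move all middle-state energy to the outer states, keeping sigma_n^2 fixed.
probs2, vars2 = two_state_equivalent(vars3[0], vars3[-1], noise_var)

gain3 = first_order_gain_bound(probs3, vars3)
gain2 = first_order_gain_bound(probs2, vars2)
print(f"three-state bound: {gain3:.4f}")
print(f"two-state bound:   {gain2:.4f}")
```

For this example the two-state bound exceeds the three-state bound while the overall noise variance is unchanged, which is the behavior the argument that follows establishes in general.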


We can make precise the argument outlined in the previous paragraph by focusing our attention on any three
states of an N > 2 state model. Suppose the states have probabilities $p_L$, $p_M$, and $p_H$ with variances
$\sigma_L^2 < \sigma_M^2 < \sigma_H^2$. Let $p = p_L + p_M + p_H$ and $p\sigma^2 = p_L\sigma_L^2 + p_M\sigma_M^2 + p_H\sigma_H^2$. Then, introduce the normalized
probabilities $\hat p_L = p_L/p$, $\hat p_M = p_M/p$, and $\hat p_H = p_H/p$ and the normalized variances $a = \sigma_L^2/\sigma^2$, $b = \sigma_M^2/\sigma^2$, and
$c = \sigma_H^2/\sigma^2$. The conditions on the normalized probabilities define a probability region in three-dimensional
space as indicated in figure A-1. The probability region lies in the plane passing through the three points
(1, 0, 0), (0, 1, 0), and (0, 0, 1) and within the triangle with these points as vertices. The constant variance
constraint, $a\hat p_L + b\hat p_M + c\hat p_H = 1$, defines a plane passing through $(1/a, 0, 0)$, $(0, 1/b, 0)$, and $(0, 0, 1/c)$. Figure A-1
shows the case when b > 1 and the intersection of the constant variance plane and the probability region is a line
intersecting sides B and C of the probability region triangle; when b < 1, their intersection would result in a line
intersecting sides A and B of the probability region triangle. The lines of intersection can be parametrized by the
low-state normalized probability by using, in succession, two substitutions: (1) $\hat p_M = 1 - \hat p_L - \hat p_H$ and (2)
$\hat p_H(c - b) = (1 - b) - \hat p_L(a - b)$, which is simply rewriting the condition that the sum of the normalized state variances
times their normalized probabilities equals 1. Equation (2) can be rewritten as
$\hat p_L(b - a) = \hat p_H(c - b) + (b - 1) \ge b - 1$ because the normalized probabilities are non-negative. This leads to
$\hat p_L \ge 0$ for the geometry of case b < 1 and $\hat p_L \ge (b-1)/(b-a)$ for the geometry of case b > 1. The processing gain
$g_1$ is a simple function of the low-state probability restricted to the line of intersection for both geometries. It is
described by a line with positive slope.


Rewrite the upper bound for processing gain as follows:

$$\sum_{k=1}^{N} p_k\frac{\sigma_n^2}{\sigma_k^2} = p\,\frac{\sigma_n^2}{\sigma^2}\left(\frac{\hat p_L}{a} + \frac{\hat p_M}{b} + \frac{\hat p_H}{c}\right) + \sum_{\text{remaining states}} p_k\frac{\sigma_n^2}{\sigma_k^2}.$$

Figure A-1. Processing gain parametrized by low-state probability.

As for the first-order detector, we would expect the upper bound for the processing gain achievable for a
multistate model second-order detector to depend on the low-state and high-state variances and the overall noise
variance, just as it did for the signal of known structure. Once again, we would like to obtain upper and lower
bounds on $g_2(p_1, \sigma_1^2, \ldots, p_N, \sigma_N^2)$ subject to $\sigma_1^2 = \sigma_L^2$, $\sigma_N^2 = \sigma_H^2$, and $\sum_{k=1}^{N} p_k\sigma_k^2 = \sigma_n^2$. As before, moving
energy from a middle state to the outer states should lead to a less Gaussian-like distribution and thus should
increase processing gain, while moving energy from the outer states to a middle state leads to a more
Gaussian-like distribution and thus decreases the processing gain. For the case of a signal of known structure,
the upper bound for processing gain is a linear function of positive slope as a function of the low-state probability
given the constraints. For the case of a signal of unknown structure, the square of the upper bound for
processing gain $g_2$ turns out to be a parabolic function opening upwards as a function of the low-state
probability (see figure A-1).


As before, we focus our attention on any three states of an N > 2 state model. Suppose the states have
probabilities $p_L$, $p_M$, and $p_H$ with variances $\sigma_L^2 < \sigma_M^2 < \sigma_H^2$. Let $p = p_L + p_M + p_H$,
$p\sigma^2 = p_L\sigma_L^2 + p_M\sigma_M^2 + p_H\sigma_H^2$, $\hat p_L = p_L/p$, $\hat p_M = p_M/p$, $\hat p_H = p_H/p$, $a = \sigma_L^2/\sigma^2$, $b = \sigma_M^2/\sigma^2$, and $c = \sigma_H^2/\sigma^2$.

The conditions on the normalized probabilities define a probability region in three-space as indicated in figure A-1.
Figure A-1 shows the case when b > 1, and the intersection of the constant variance plane and the probability
region is a line intersecting sides B and C of the probability region triangle; when b < 1, their intersection would
result in a line intersecting sides A and B of the probability region triangle. The lines of intersection can be
parametrized by the low-state normalized probability by using in succession two substitutions:
(1) $\hat p_M = 1 - \hat p_L - \hat p_H$ and (2) $\hat p_H(c - b) = (1 - b) - \hat p_L(a - b)$. Equation (2) can be rewritten as
$\hat p_L(b - a) = \hat p_H(c - b) + (b - 1) \ge b - 1$. This leads to $\hat p_L \ge 0$ for the geometry of case b < 1 and
$\hat p_L \ge (b-1)/(b-a)$ for the geometry of case b > 1.

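For the first factor of the square of the upper bound for processing gain,

$$\begin{aligned}
\sum_{k=1}^{N} p_k\frac{\sigma_n^4}{\sigma_k^4}
&= p\,\frac{\sigma_n^4}{\sigma^4}\left(\frac{\hat p_L}{a^2} + \frac{\hat p_M}{b^2} + \frac{\hat p_H}{c^2}\right) + \sum_{\text{remaining states}} p_k\frac{\sigma_n^4}{\sigma_k^4} \\
&= p\,\frac{\sigma_n^4}{\sigma^4}\,\frac{1}{a^2b^2c^2}\left(\hat p_L c^2(b^2 - a^2) + a^2c^2 + \hat p_H a^2(b^2 - c^2)\right) + \sum_{\text{remaining states}} p_k\frac{\sigma_n^4}{\sigma_k^4} \\
&= p\,\frac{\sigma_n^4}{\sigma^4}\,\frac{1}{a^2b^2c^2}\left(\hat p_L(b - a)(c - a)(bc + ba + ac) + a^2c^2 + a^2(b + c)(b - 1)\right) + \sum_{\text{remaining states}} p_k\frac{\sigma_n^4}{\sigma_k^4} \\
&= p\,\frac{\sigma_n^4}{\sigma^4}\,\frac{1}{a^2b^2c^2}\left[\rho(bc + ba + ac) + a^2c^2 + a^2(b + c)(b - 1) + D\right],
\end{aligned}$$

where $\rho = (b - a)(c - a)\hat p_L$ and $D = \left(p\,\dfrac{\sigma_n^4}{\sigma^4}\,\dfrac{1}{a^2b^2c^2}\right)^{-1}\sum_{\text{remaining states}} p_k\dfrac{\sigma_n^4}{\sigma_k^4}$.

For the second factor of the square of the upper bound for processing gain,

$$\begin{aligned}
2\sum_{k=1}^{N} p_k\frac{\sigma_k^4}{\sigma_n^4} - 1
&= 2\left[p\,\frac{\sigma^4}{\sigma_n^4}\left(\hat p_L a^2 + \hat p_M b^2 + \hat p_H c^2\right) + \sum_{\text{remaining states}} p_k\frac{\sigma_k^4}{\sigma_n^4}\right] - 1 \\
&= 2\left[p\,\frac{\sigma^4}{\sigma_n^4}\left\{\hat p_L(a^2 - b^2) + b^2 + \hat p_H(c^2 - b^2)\right\} + \sum_{\text{remaining states}} p_k\frac{\sigma_k^4}{\sigma_n^4}\right] - 1 \\
&= p\,\frac{\sigma^4}{\sigma_n^4}\left[2\{\rho + b + c - bc\} + E - F\right],
\end{aligned}$$

where $\rho$ is as above, $E = 2\left(p\,\dfrac{\sigma^4}{\sigma_n^4}\right)^{-1}\sum_{\text{remaining states}} p_k\dfrac{\sigma_k^4}{\sigma_n^4}$, and $F = \left(p\,\dfrac{\sigma^4}{\sigma_n^4}\right)^{-1}$.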


In preparation for calculations to follow, we need $2 + E - F \ge 0$. Note that

$$2p\,\frac{\sigma^4}{\sigma_n^4} + 2\sum_{\text{remaining states}} p_k\frac{\sigma_k^4}{\sigma_n^4} - 1 \;\ge\; \frac{2}{\sigma_n^4}\Big(p\sigma^2 + \sum_{\text{remaining states}} p_k\sigma_k^2\Big)^2 - 1 = 1,$$

because the weights $p$ and the remaining $p_k$ sum to 1, $p\sigma^2 + \sum_{\text{remaining states}} p_k\sigma_k^2 = \sigma_n^2$, and a weighted mean of squares is at
least the square of the weighted mean by the Cauchy-Schwarz inequality. The left side equals $p(\sigma^4/\sigma_n^4)(2 + E - F)$,
so $2 + E - F \ge 0$.

The coefficients of the normalized low-state probability are obviously positive for both factors of the upper bound
for processing gain so that the upper bound for the processing gain squared is a parabola opening upwards. It is
then clear that the maximum of the upper bound for processing gain as a function of state probabilities occurs
when one of the probabilities is zero. It follows that an upper bound for an N state model is less than that for an
N-1 state model, and thus by induction for a two-state model. However, more is true, namely, the parabola is
increasing over the allowable values of $\rho$ so that the upper bound for processing gain is everywhere an
increasing function of the low-state probability for N > 2. We proceed to prove this.


Suppose that b < 1. To show that the critical point of the parabola occurs when $\rho < 0$, it suffices to show that the
coefficient of $\rho$ in the product of the two factors is positive because the coefficient of $\rho^2$ is positive. To conclude
this, observe that the coefficient of $\rho$ is a quadratic polynomial in c with the coefficient of $c^2$ given by
$2a^2 + 2(1 - b)(b + a) > 0$ and that of c given by $(b + a)(2b + E - F) + 2(1 - b)a(b - a)$. We proceed by showing
that the critical point of the parabola in c is $\le 1$ and showing that the coefficient of $\rho$ is $> 0$ when c = 1. That the
critical point is as desired follows from $4a^2 + 4(1 - b)(b + a) \ge -(b + a)(2b + E - F) - 2(1 - b)a(b - a)$, using
a < b < 1 and $2 + E - F \ge 0$. For c = 1, that the coefficient of $\rho$ is positive follows from
$(2 + E - F)(a + b + ab) + 2a^2b^2 + 2D > 0$, which can easily be established by directly evaluating the coefficient
of $\rho$ in the product of the factors before expressing the coefficient as a quadratic in c.


Suppose that b > 1. In this case, it is convenient to introduce $\tilde\rho$ defined by $\rho = \tilde\rho + (b - 1)(c - a)$ so that the
lower bound on $\rho$ can be replaced by $\tilde\rho \ge 0$. To show that the critical point of the parabola occurs when $\tilde\rho < 0$,
it suffices to show that the coefficient of $\tilde\rho$ in the product of the two factors is positive because, as before, the
coefficient of $\tilde\rho^2$ (which is the same as that of $\rho^2$) is positive. To do this we observe that the coefficient of $\tilde\rho$ is
a quadratic in c with the coefficient of $c^2$ given by $2a^2 + 2(b - 1)(b + a) > 0$ and the coefficient of c given by
$(b + a)[2(b + a - ab) + E - F]$. Since the coefficient of $c^2$ is positive, it suffices to show that the critical point for
the parabola in c is $\le b$ and that the coefficient of $\tilde\rho$ is positive when c = b. That the critical point is
as desired follows from $2b\{2a^2 + 2(b - 1)(b + a)\} \ge -(b + a)[2(b + a - ab) + E - F]$, which is equivalent to
$4ba^2 + (4b - 2a)(b - 1)(b + a) + (b + a)(2b + E - F) \ge 0$, which is clearly true because
$2b + E - F > 2 + E - F \ge 0$. It is easy by direct calculation to evaluate the coefficient of $\tilde\rho$ when c = b and
show that it is positive by using the same inequality as in the previous sentence.


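The argument presented in the last paragraph shows that given any N-state model with N > 2, it is always
possible to construct a two-state model with $\alpha_1 = \sigma_L^2$, $\alpha_2 = \sigma_H^2$, and $p_1\alpha_1 + p_2\alpha_2 = \sigma_n^2$ such that
$g_2(p_1, \sigma_1^2, \ldots, p_N, \sigma_N^2) \le g_2(p_1, \alpha_1, p_2, \alpha_2)$. A global bound for any N-state mixture model of two or more
states is obtained by observing that the constraints completely define the two-state model obtained by the
construction process. The probabilities of the two states can be calculated from $p\sigma_L^2 + (1 - p)\sigma_H^2 = \sigma_n^2$. Solving
this equation for $p$ and $1 - p$ yields

$$p = \frac{\sigma_H^2 - \sigma_n^2}{\sigma_H^2 - \sigma_L^2} \quad\text{and}\quad 1 - p = \frac{\sigma_n^2 - \sigma_L^2}{\sigma_H^2 - \sigma_L^2}.$$

Then

$$\begin{aligned}
p\Big(\frac{\sigma_n^2}{\sigma_L^2}\Big)^2 + (1 - p)\Big(\frac{\sigma_n^2}{\sigma_H^2}\Big)^2
&= 1 - [p + (1 - p)] + \Big[p\Big(\frac{\sigma_n^2}{\sigma_L^2}\Big)^2 + (1 - p)\Big(\frac{\sigma_n^2}{\sigma_H^2}\Big)^2\Big] \\
&= 1 + \Big[\frac{\sigma_H^2 - \sigma_n^2}{\sigma_H^2 - \sigma_L^2}\Big]\Big(\Big(\frac{\sigma_n^2}{\sigma_L^2}\Big)^2 - 1\Big) + \Big[\frac{\sigma_n^2 - \sigma_L^2}{\sigma_H^2 - \sigma_L^2}\Big]\Big(\Big(\frac{\sigma_n^2}{\sigma_H^2}\Big)^2 - 1\Big) \\
&= 1 + \frac{(\sigma_n^2 - \sigma_L^2)(\sigma_H^2 - \sigma_n^2)}{\sigma_L^4\sigma_H^4}\big(\sigma_L^2\sigma_H^2 + \sigma_n^2(\sigma_L^2 + \sigma_H^2)\big)
\end{aligned}$$

provides a formula for the first factor, while by easy algebra the second factor can be written as

$$2\Big[p\Big(\frac{\sigma_L^2}{\sigma_n^2}\Big)^2 + (1 - p)\Big(\frac{\sigma_H^2}{\sigma_n^2}\Big)^2\Big] - 1 = 1 + \frac{2}{\sigma_n^4}(\sigma_n^2 - \sigma_L^2)(\sigma_H^2 - \sigma_n^2).$$

These expressions exhibit the upper bound for processing gain for a nontrivial two-state model as positive.
Letting $a = \sigma_L^2/\sigma_n^2$ and $b = \sigma_H^2/\sigma_n^2$, it follows that

$$g_2(a, b) = \sqrt{\Big[1 + \frac{(1 - a)(b - 1)(a + b + ab)}{a^2b^2}\Big]\big[1 + 2(1 - a)(b - 1)\big]},$$

which shows how the two-state processing gain upper bound depends on the normalized variances.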
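As a consistency check on the two-state result, the sketch below evaluates the second-order bound two ways: directly from the two factors, and from the closed form in the normalized variances a and b. It assumes the bound is the square root of the product of the two factors, as in this appendix; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def second_order_bound_direct(probs, variances):
    """sqrt(first factor * second factor), evaluated from the mixture itself."""
    noise_var = sum(p * v for p, v in zip(probs, variances))
    first = sum(p * (noise_var / v) ** 2 for p, v in zip(probs, variances))
    second = 2.0 * sum(p * (v / noise_var) ** 2
                       for p, v in zip(probs, variances)) - 1.0
    return math.sqrt(first * second)


def second_order_bound_two_state(a, b):
    """Closed form with a = sigma_L^2/sigma_n^2 < 1 < b = sigma_H^2/sigma_n^2."""
    first = 1.0 + (1.0 - a) * (b - 1.0) * (a + b + a * b) / (a * a * b * b)
    second = 1.0 + 2.0 * (1.0 - a) * (b - 1.0)
    return math.sqrt(first * second)


# Illustrative two-state model: the constraints fix the state probabilities.
low_var, high_var, noise_var = 1.0, 16.0, 4.0
p_low = (high_var - noise_var) / (high_var - low_var)

direct = second_order_bound_direct([p_low, 1.0 - p_low], [low_var, high_var])
closed = second_order_bound_two_state(low_var / noise_var, high_var / noise_var)
print(f"direct: {direct:.6f}  closed form: {closed:.6f}")
```

The two evaluations agree to rounding error, which is a useful guard against transcription slips when the closed form is coded separately from the mixture sums.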
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APPENDIX  B 

Processing  Gain  Bounds  for  the  Middleton  Class  A  Noise  Model 


In this section, we derive processing gain bounds for the Middleton class A noise model. To make this appendix
self-contained, we recall a few basic facts about the Middleton class A noise model. Middleton's class A noise
model is a Gaussian mixture model with an infinite number of terms. Middleton has also introduced noise
models of class B and C. These models describe noise for which the individual noise sources are received at
certain times and not at others. A noise impulse is categorized as class A if it produces a decaying response
from a receiver, the case for undersea surveillance applications; class B if it produces ringing of the receiver;
and class C if it sometimes causes ringing and other times does not, often the case for communications receivers.

Middleton (1977, 1983) and Middleton and Spaulding (1983) derive univariate probability density functions for
class A and class B noise, by supposing the noise consists of Gaussian background noise and interference from
discrete sources which are Poisson-distributed in space. (References in this appendix are listed at the end of the
body.) The density function that they obtained for class A noise is given by

$$p(x) = \frac{1}{\sqrt{2\pi}}\sum_{m=0}^{\infty} e^{-A}\frac{A^m}{m!}\,\frac{1}{\sigma_m}\,e^{-x^2/(2\sigma_m^2)}.$$

We assume that $\sigma_0^2$ is the variance of the background noise. This probability density function is formulated
for real quantities and we wish to use the noise model to describe complex samples. To do this, we introduce
the spherically invariant probability density function

$$p(|z|) = \frac{1}{\pi}\sum_{m=0}^{\infty} e^{-A}\frac{A^m}{m!}\,\frac{1}{\sigma_m^2}\,e^{-|z|^2/\sigma_m^2}$$

and interpret the parameters of the model as applying to the norms of the samples in place of the real quantities of the original
formulation. In other words, we use the Middleton class A noise model to describe received interference power.


We provide a physical interpretation of the Middleton class A noise model by assuming that each sample belongs
to a single state. The states for the Middleton class A noise model are then determined by the number m of
discrete interference sources active at any given time. In the equation, the probability of being in the m-th
state is $P_m(A) = e^{-A}A^m/m!$. This is the probability that interference generated by m discrete sources is received.
The m = 0 state describes the Gaussian background noise level when the background noise does not contain
interference from any of the discrete sources. The expected number of discrete sources for which interference is
received is A. Middleton calls A the overlap index, which he defines as A = rD, where r is the expected number
of sources whose interference is received per unit time, and D is the expected duration of such a reception. For
this model, the background noise has variance $\sigma_0^2$ and the expected received noise plus interference power for
the m-th state is given by $\sigma_m^2 = \sigma_0^2 + m\Omega/A$, with $\Omega/A$ the expected power received from each interferer. It
follows that the overall noise variance for the model is $\sigma^2 = \sigma_0^2 + \Omega$.
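The state structure just described can be tabulated directly. The following sketch assumes the state probabilities $P_m(A) = e^{-A}A^m/m!$ and state variances $\sigma_m^2 = \sigma_0^2 + m\Omega/A$ given above; the function name, the truncation at 60 states, and the parameter values are hypothetical choices for illustration.

```python
import math


def class_a_states(overlap_index, background_var, interference_power, n_states):
    """List (probability, variance) for states m = 0 .. n_states - 1."""
    states = []
    for m in range(n_states):
        # Poisson weight, computed in log form to avoid overflow for large m.
        log_prob = (-overlap_index + m * math.log(overlap_index)
                    - math.lgamma(m + 1))
        variance = background_var + m * interference_power / overlap_index
        states.append((math.exp(log_prob), variance))
    return states


# Illustrative parameters: A = 2, sigma_0^2 = 0.1, Omega = 1.
states = class_a_states(2.0, 0.1, 1.0, 60)
total_prob = sum(p for p, _ in states)
overall_var = sum(p * v for p, v in states)
print(f"total probability: {total_prob:.6f}")
print(f"overall variance:  {overall_var:.6f}  (sigma_0^2 + Omega = 1.1)")
```

With enough states retained, the probabilities sum to 1 and the probability-weighted variance reproduces $\sigma_0^2 + \Omega$, as stated in the text.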


The processing gain upper bounds obtained for Gaussian mixture models for signals of known and unknown
structure can be extended to Middleton's class A model by an easy limit argument. Let

$$\frac{\sigma_0^2 + \Omega}{\sigma_m^2} = \frac{(1 + \Gamma)A}{m + A\Gamma} = R_m(A, \Gamma),$$

where $\Gamma = \sigma_0^2/\Omega$. Then the processing gain upper bound for the Middleton class A noise
model and signal of known structure is $g_1(A, \Gamma) = \sum_{m=0}^{\infty} P_m(A)R_m(A, \Gamma)$, while the processing gain upper bound
for the Middleton class A noise model and signal of unknown structure is

$$g_2(A, \Gamma) = \sqrt{\Big[\sum_{m=0}^{\infty} P_m(A)R_m^2(A, \Gamma)\Big]\Big[2\sum_{m=0}^{\infty}\frac{P_m(A)}{R_m^2(A, \Gamma)} - 1\Big]}.$$

We now show that as A and $\Gamma$ tend to infinity, both the first-order and second-order gains tend to 1. In other
words, there is virtually no gain over classical processing when A or $\Gamma$ is large. We first consider the case of
$g_1(A, \Gamma)$ and show that $\lim_{A\to\infty} g_1(A, \Gamma) = 1$ and $\lim_{\Gamma\to\infty} g_1(A, \Gamma) = 1$.

To prove the first limit, we use Chebychev's Inequality (Guttman, Wilks, and Hunter, 1971, p. 94) and some basic
properties of the Poisson distribution (Guttman et al., 1971, p. 116). Recall that Chebychev's Inequality states that
for any probability density function p with mean $\mu$ and standard deviation $\sigma$, and any k > 0,
$P(|X - \mu| \ge k\sigma) \le 1/k^2$. The Poisson distribution with parameter A is given by $p(m) = e^{-A}A^m/m!$, m = 0, 1, 2, ...
The mean and the variance of the Poisson distribution both have value A. Let $\lambda(A)$ be a positive function of A
such that $\lim_{A\to\infty} 1/\lambda(A) = 0$ and $\lim_{A\to\infty} \lambda(A)/\sqrt{A} = 0$. Examples of such $\lambda(A)$'s are a small positive power of A,
such as $A^{1/4}$, and $\sqrt{A}/\ln A$. Then using Chebychev's Inequality,

$$\sum_{|m - A| \ge \lambda(A)\sqrt{A}} e^{-A}\frac{A^m}{m!} \le \frac{1}{\lambda^2(A)}.$$

For $\Gamma > 0$, and using the fact that $\dfrac{(1 + \Gamma)A}{m + A\Gamma}$ is a decreasing function of m,

$$\Big(1 - \frac{1}{\lambda^2(A)}\Big)\frac{(1 + \Gamma)A}{A + \lambda(A)\sqrt{A} + A\Gamma} \;\le\; g_1(A, \Gamma) \;\le\; \frac{(1 + \Gamma)A}{A - \lambda(A)\sqrt{A} + A\Gamma} + \frac{1}{\lambda^2(A)}\,\frac{(1 + \Gamma)A}{A\Gamma}.$$

Now observe from the defining properties of $\lambda(A)$ that the first and last terms of the above inequality tend to 1 as
$A \to \infty$ to conclude that $\lim_{A\to\infty} g_1(A, \Gamma) = 1$. Note that in evaluating $\lim_{A\to\infty} g_1(A, \Gamma)$, the limit cannot be taken inside
the sum since $\lim_{A\to\infty} e^{-A}\dfrac{A^m}{m!}\dfrac{(1 + \Gamma)A}{m + A\Gamma} = 0$ for each m and the sum would then be 0.

We next prove that $\lim_{\Gamma\to\infty} g_1(A, \Gamma) = 1$. Observe that $\lim_{\Gamma\to\infty}\dfrac{(1 + \Gamma)A}{m + A\Gamma} = 1$ and $\dfrac{(1 + \Gamma)A}{m + A\Gamma} = 1 - \dfrac{m - A}{m + A\Gamma}$. From the
first equality, observe that $\lim_{\Gamma\to\infty} g_1(A, \Gamma) = 1$ is true if we formally interchange the sum and the limit. However, as
we noted previously, one must be careful that the operation of interchanging the sum and the limit is justified.
From the second equality, observe that $\dfrac{(1 + \Gamma)A}{m + A\Gamma}$ is an increasing function of $\Gamma$ when m > A. Therefore,

$$\begin{aligned}
\lim_{\Gamma\to\infty} g_1(A, \Gamma)
&= \lim_{\Gamma\to\infty}\sum_{m \le A} e^{-A}\frac{A^m}{m!}\frac{(1 + \Gamma)A}{m + A\Gamma} + \lim_{\Gamma\to\infty}\sum_{m > A} e^{-A}\frac{A^m}{m!}\frac{(1 + \Gamma)A}{m + A\Gamma} \\
&= \sum_{m \le A} e^{-A}\frac{A^m}{m!} + \sum_{m > A} e^{-A}\frac{A^m}{m!} = 1.
\end{aligned}$$

Note that the first limit on the right converges since there are only a finite number of terms in the sum and the second
limit converges by the Monotone Convergence Theorem (Halmos, 1957, 103, p. 112, Theorem B).
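The two limits can be observed numerically. The sketch below evaluates $g_1(A, \Gamma) = \sum_m P_m(A)R_m(A, \Gamma)$ with a truncated sum; the truncation rule (roughly A plus twelve standard deviations plus a margin) is an assumption made for this example and is not taken from the report.

```python
import math


def poisson_pmf(m, a):
    """e^{-a} a^m / m!, computed in log form for numerical stability."""
    return math.exp(-a + m * math.log(a) - math.lgamma(m + 1))


def g1(a, gamma, n_terms=None):
    """Truncated first-order gain bound for the class A model."""
    if n_terms is None:
        # Assumed truncation: far enough into the Poisson tail to be negligible.
        n_terms = int(a + 12.0 * math.sqrt(a) + 40.0)
    return sum(poisson_pmf(m, a) * (1.0 + gamma) * a / (m + a * gamma)
               for m in range(n_terms))


print(f"g1(A=1,   Gamma=0.01) = {g1(1.0, 0.01):.4f}")
print(f"g1(A=400, Gamma=0.1)  = {g1(400.0, 0.1):.6f}")
print(f"g1(A=2,   Gamma=1000) = {g1(2.0, 1000.0):.6f}")
```

Small A with low background noise gives a large bound, while large A or large $\Gamma$ drives the bound toward 1, in line with the limits proved above.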


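Consider the gain for the second-order detector. Recall that

$$g_2(A, \Gamma) = \sqrt{\Big[\sum_{m=0}^{\infty} e^{-A}\frac{A^m}{m!}\Big(\frac{(1 + \Gamma)A}{m + A\Gamma}\Big)^2\Big]\Big[2\sum_{m=0}^{\infty} e^{-A}\frac{A^m}{m!}\Big(\frac{m + A\Gamma}{(1 + \Gamma)A}\Big)^2 - 1\Big]}.$$

From the properties of the Poisson distribution, $\sum_{m=0}^{\infty} e^{-A}\dfrac{A^m}{m!}\,m = A$ and $\sum_{m=0}^{\infty} e^{-A}\dfrac{A^m}{m!}(m - A)^2 = A$.
Therefore, $\sum_{m=0}^{\infty} e^{-A}\dfrac{A^m}{m!}\,m^2 = A^2 + A$. Then

$$\sum_{m=0}^{\infty} e^{-A}\frac{A^m}{m!}\Big(\frac{m + A\Gamma}{(1 + \Gamma)A}\Big)^2 = 1 + \frac{1}{(1 + \Gamma)^2A}.$$

Thus,

$$g_2(A, \Gamma) = \sqrt{\Big[\sum_{m=0}^{\infty} e^{-A}\frac{A^m}{m!}\Big(\frac{(1 + \Gamma)A}{m + A\Gamma}\Big)^2\Big]\Big[1 + \frac{2}{(1 + \Gamma)^2A}\Big]}.$$

Clearly, the second term inside the radical tends to 1 as A or $\Gamma \to \infty$. The first term inside the
radical can be shown to tend to 1 as A or $\Gamma \to \infty$ in exactly the same way as for $g_1(A, \Gamma)$ and we omit the
details.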
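The closed form for the second factor can be checked against the defining sum. In the sketch below, `second_factor_sum` evaluates $2\sum_m P_m/R_m^2 - 1$ term by term and `second_factor_closed` uses $1 + 2/((1 + \Gamma)^2 A)$; both function names and the truncation lengths are hypothetical choices for this example.

```python
import math


def poisson_pmf(m, a):
    """e^{-a} a^m / m!, computed in log form for numerical stability."""
    return math.exp(-a + m * math.log(a) - math.lgamma(m + 1))


def second_factor_sum(a, gamma, n_terms):
    """2 * sum_m P_m / R_m^2 - 1, evaluated term by term."""
    total = sum(poisson_pmf(m, a) * ((m + a * gamma) / ((1.0 + gamma) * a)) ** 2
                for m in range(n_terms))
    return 2.0 * total - 1.0


def second_factor_closed(a, gamma):
    """Closed form obtained from the Poisson mean and variance."""
    return 1.0 + 2.0 / ((1.0 + gamma) ** 2 * a)


def g2(a, gamma, n_terms=None):
    """Truncated second-order gain bound for the class A model."""
    if n_terms is None:
        n_terms = int(a + 12.0 * math.sqrt(a) + 40.0)  # assumed truncation
    first = sum(poisson_pmf(m, a) * ((1.0 + gamma) * a / (m + a * gamma)) ** 2
                for m in range(n_terms))
    return math.sqrt(first * second_factor_closed(a, gamma))


print(second_factor_sum(3.0, 0.2, 200), second_factor_closed(3.0, 0.2))
print(f"g2(A=200, Gamma=0.5) = {g2(200.0, 0.5):.6f}")
```

Using the closed form for the second factor means only the first factor needs a truncated sum when the bound is evaluated on a grid of A and $\Gamma$ values.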

We now determine the number of terms needed to obtain good approximations of $g_1(A, \Gamma)$ and $g_2(A, \Gamma)$. The
result that we obtain states that for any fixed A and any $\Gamma$, using 2A + 1 terms in the sum will give very good
approximations.

We  first  consider  (v4,  F) .  Recall  that  Stirling's  formula  states  that 
^ (f )"  < n’.  <  ^ (f )"  forn  =  1 , 2, .... 


Let  >  0  be  given.  Let  ^  ^  if  /4  is  an  integer  and  let  [v4]  + 1  otherwise.  We  have  for  m  >  that 

dZ  -  ‘2N  ^  £L,_m-)JE.,eN.,jL  when  m  =  2N 

m!  N'.  ml.  N<.  ‘  12)V- 1 '  ,/2S?(i)"  ^  h2W- 1 '  '  e"  ' 

^  -1^ = s  ^  s  I 

V  m  / 

„>2N  dl^jZ(.J2N 

’  m!  AA.  M2A^- r  ^  2  ^  ^2^ 

Hence  £ 

m=2A’+i  m+Ar  N]  N+Ar  12N-1  2  ^2N+i  Jm  2 

^  AN{\+T)A  \2N  1 

”  M  N+Ar  ^  12N-  2  ’  J2 


^  A^{\+T)A  \2N  1 

”  M  N+Ar  ^  12N-  2  ’  B 


y  A” 

(1+FM 

_y 

(1+r)^ 

oe 

_  y 

^m 

(1+0^4 

ni\ 

m+Ar 

m=0  fn\ 

m+Ar 

-  Zj 

m=2W+l 

m\ 

m+Ar 

^  Af^{\+r)A  \2N  1  <  (  \2N  1  y  A”  (l+TM 

"  A/!  iV+y4F  M2A^-r^  J2  ~  ^12A^-1  ^  J2  ^  ' 

Therefore, 

y 

wil  fn+AP  *  ^  f  ^ 

\-— -  <  )(-^)^^  and  by  the  Mean  Value  Theorem 

y  \2N- 1  2 

“  ml  m-t-AF 
m=C 
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yt"  (l-i-ru 

ml  m+AF 

A-  (l+HA 
ml  m+,<r 

For example, a 31-term approximation is within 0.01 dB for 0 < A ≤ 15, and for 60 terms, the approximation is
within 0.00003 dB for 0 < A ≤ 30. Note that the above upper bound is very conservative since we have shown
that the tail of the series is much less than the N-th term alone.
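The adequacy of a 31-term sum can be spot-checked by comparing it with a much longer sum. The sketch below does this for the first-order bound at a few parameter values; it assumes gains are expressed in decibels as $10\log_{10}$ of the bound, and the 400-term reference sum and the sampled values of A and $\Gamma$ are choices made for this example.

```python
import math


def poisson_pmf(m, a):
    """e^{-a} a^m / m!, computed in log form for numerical stability."""
    return math.exp(-a + m * math.log(a) - math.lgamma(m + 1))


def g1_partial(a, gamma, n_terms):
    """First n_terms of the first-order gain bound sum."""
    return sum(poisson_pmf(m, a) * (1.0 + gamma) * a / (m + a * gamma)
               for m in range(n_terms))


def to_db(x):
    """Assumed decibel convention: 10 log10 of the gain bound."""
    return 10.0 * math.log10(x)


worst = 0.0
for a in (1.0, 5.0, 15.0):
    for gamma in (0.01, 0.3):
        short = to_db(g1_partial(a, gamma, 31))
        long_ = to_db(g1_partial(a, gamma, 400))  # stand-in for the infinite sum
        worst = max(worst, abs(long_ - short))
print(f"largest 31-term truncation error over the samples: {worst:.2e} dB")
```

The largest discrepancy occurs at the upper end of the range of A, where more of the Poisson mass lies beyond the 31st term.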

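Consider $g_2(A, \Gamma)$. Recall that $g_2(A, \Gamma) = \sqrt{\Big[\sum_{m=0}^{\infty} e^{-A}\dfrac{A^m}{m!}\Big(\dfrac{(1 + \Gamma)A}{m + A\Gamma}\Big)^2\Big]\Big[1 + \dfrac{2}{(1 + \Gamma)^2A}\Big]}$.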


Using  the  exact  same  computation  as  before, 

.n 


_ 1  (  \2N 

~  In(lO)  yj  2  ’ 


Figures B-1 and B-2 present processing gain of first-order and second-order detectors for the Middleton class A
noise model. Each figure contains a three-dimensional plot of the gain in decibels and a contour plot of the gain
in decibels. The plots were generated based on the above considerations by using the first 31 terms of the
infinite sums in $g_1(A, \Gamma)$ and $g_2(A, \Gamma)$ for 1 ≤ A ≤ 16 and 0.01 ≤ $\Gamma$ ≤ 0.3. The plots indicate significant gain for
small A and low background noise as expected.


A natural two-state model to approximate the Middleton class A noise model has low-state variance equal to $\sigma_0^2$
and low-state probability $e^{-A}$. The high-state variance is then determined by enforcing the condition that
$\sigma^2 = \sigma_0^2 + \Omega = \sigma_0^2e^{-A} + \sigma_H^2(1 - e^{-A})$, which implies that $\sigma_H^2 = \sigma_0^2 + \dfrac{\Omega}{1 - e^{-A}}$.
Note that for small A, $1 - e^{-A} \approx A$ and the above is close to the high-state variance obtained for the
two-state approximation of the Middleton class A noise model described in appendix A to the companion report
"Gaussian Mixture Models for Acoustic Interference".


For this two-state model and signal of known structure, an upper bound for the processing gain is

$$g_1(A, \Gamma) = e^{-A}\,\frac{1 + \Gamma}{\Gamma} + (1 - e^{-A})^2\,\frac{1 + \Gamma}{1 + \Gamma(1 - e^{-A})}.$$

For this two-state model and signal of unknown structure, an upper bound for the processing gain is
$g_2(A, \Gamma) = \sqrt{f_1(A, \Gamma)\,(2f_2(A, \Gamma) - 1)}$ with

$$f_1(A, \Gamma) = e^{-A}\Big(\frac{1 + \Gamma}{\Gamma}\Big)^2 + (1 - e^{-A})^3\Big(\frac{1 + \Gamma}{1 + \Gamma(1 - e^{-A})}\Big)^2$$

and

$$f_2(A, \Gamma) = e^{-A}\Big(\frac{\Gamma}{1 + \Gamma}\Big)^2 + \frac{1}{1 - e^{-A}}\Big(\frac{1 + \Gamma(1 - e^{-A})}{1 + \Gamma}\Big)^2.$$

Figure B-1. Processing gain of a first-order detector for Middleton class A noise model.

Figure B-2. Processing gain of a second-order detector for Middleton class A noise model.

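A natural three-state model to approximate the Middleton class A noise model has low-state, middle-state, and
high-state probabilities and variances

$$e^{-A},\ \sigma_0^2; \qquad A e^{-A},\ \sigma_0^2 + \frac{\Omega}{A}; \qquad\text{and}\qquad 1 - e^{-A} - A e^{-A},\ \sigma_0^2 + \frac{1 - e^{-A}}{1 - e^{-A} - A e^{-A}}\,\Omega.$$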
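As a quick numerical check of the two-state and three-state constructions above, the following sketch builds the state probabilities and variances and confirms that the overall variance is preserved. The function names and the arguments `sigma0_sq` and `omega` (standing for the low-state variance and the remaining variance term in the condition above) are illustrative assumptions, not part of the report.

```python
import math

def finite_state_model(A, sigma0_sq, omega, states=2):
    """Return (probabilities, variances) of a 2- or 3-state approximation.

    The high-state variance is chosen so that the mixture keeps the
    overall variance sigma0_sq + omega (assumed reading of the text).
    """
    p0 = math.exp(-A)                      # low-state probability
    if states == 2:
        probs = [p0, 1.0 - p0]
        variances = [sigma0_sq, sigma0_sq + omega / (1.0 - p0)]
    elif states == 3:
        p1 = A * p0                        # middle-state probability
        p2 = 1.0 - p0 - p1                 # high-state probability
        probs = [p0, p1, p2]
        variances = [sigma0_sq,
                     sigma0_sq + omega / A,
                     sigma0_sq + omega * (1.0 - p0) / p2]
    else:
        raise ValueError("states must be 2 or 3")
    return probs, variances

def total_variance(probs, variances):
    """Variance of a zero-mean Gaussian mixture."""
    return sum(p * v for p, v in zip(probs, variances))
```

For any A > 0, `total_variance(*finite_state_model(A, s, w, n))` should return `s + w` up to rounding for both `n = 2` and `n = 3`.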


For  this  three-state  model  and  signal  of  known  structure,  an  upper  bound  for  the  processing  gain  is 


h2iA,r)  =  e  +  (1  -Ae-'^  - 

1  1  +y4I  (l—e~‘ 


1+r 


). 


For  this  three-state  model  and  signal  of  unknown  structure,  an  upper  bound  for  the  processing  gain  is 
h2(A, n  =  Jfi(A,r)i2f2iA,r)-l)  with 


MA,D  =  +  (1  ^ 


1+r 


and 


{\-e-'^)  +  ri\-Ae-^-e-'^)' 


Figures B-3 and B-4 compare the processing gain bounds obtained for the first-order and second-order detectors,
respectively, for the two-state and three-state approximations to the Middleton class A noise model and the
31-term approximation of the Middleton class A noise model. The comparisons are contour plots for abscissa A
and ordinate Γ satisfying 0.1 < A < 20 and 0.01 < Γ < 0.5. The contours are for differences between the
processing gain bounds in decibels. For the range of mixture model parameters surveyed, the upper bounds
were very close for a first-order detector, less than 0.8 dB for the two-state model and 0.6 dB for the three-state
model, and a comparison of the plots indicates further that they never differed by more than 0.2 dB gain. The
same comparison for the second-order detectors indicates slightly larger maximum differences for the two-state
and three-state models' processing gain bounds than exhibited for the first-order detectors: a maximum of 1.4 dB
for a two-state model, a maximum of 0.8 dB for a three-state model, and a gain of up to 0.6 dB for using a
three-state model instead of a two-state model. To the extent that the upper bounds for processing gain
estimate the achievable processing gains, these results support placing emphasis on fitting actual data with
two-state and three-state mixture models.
Figure B-3. Comparison of processing gains of the Middleton class A first-order detector and finite state
approximations: (a) vs a two-state model; (b) vs a three-state model.

Note: The probability and variance of the high state in each finite state model are adjusted to keep the
overall variance constant. Range of plots: 0.01 < Γ < 0.5, 0.1 < A < 20.
Figure B-4. Comparison of processing gains of the Middleton class A second-order detector and finite state
approximations: (a) vs a two-state model; (b) vs a three-state model.

Note: The probability and variance of the high state in each finite state model are adjusted to keep the
overall variance constant. Range of plots: 0.01 < Γ < 0.6, 0.1 < A < 20.
APPENDIX C

Performance  Loss  Due  to  Target  Motion 


Introduction. 

Detecting quiet targets in a surveillance system typically requires processing that integrates beamformer outputs
over many samples in order to accumulate sufficient signal energy to detect. This integration often requires
tracking for large aperture arrays with fine spatial resolution, because a moving target could travel through
multiple spatial cells during the integration time required for detection. If the target's velocity is unknown, as it
generally is before detection, the integration time has to be limited to the length of time a target is likely to remain
in a single cell unless tracking algorithms or eye integration is used.

Figure C-1 illustrates the need for combining spatial energy for different spatial cells to achieve detection in a
scenario similar to some modeled by the HGI simulator. Figure C-1 displays the cumulative probability
distribution for the number of spatial cell boundaries crossed by a randomly chosen target track over one-half
hour. The target's initial position has a uniform random distribution on a disk of radius 1000 km, with the spatial
cells being 1-degree by 2-km annular sectors, and the velocity distribution is uniform on a disk of radius 28 km/hr.
Depth and frequency are fixed and known. Note that in this example very few targets remain in one cell, and
about half of them cross at least three cell boundaries. This example suggests the need for combining energy for
multiple spatial cells for large surveillance arrays employing matched field beamforming or similar beamforming.
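The scenario just described can be explored with a small Monte Carlo sketch. It is not the HGI simulator: it assumes straight-line motion and counts only the net change in range-cell and bearing-cell index between the two end points of each half-hour track, which slightly undercounts crossings for tracks that pass their closest point of approach. All function and parameter names are illustrative.

```python
import math
import random

def count_crossings(rng, disk_km=1000.0, vmax_kmh=28.0, hours=0.5,
                    range_cell_km=2.0, bearing_cell_deg=1.0):
    """Net range plus bearing cell boundaries between track end points."""
    # Uniform position on a disk: radius R*sqrt(U), bearing uniform.
    r0 = disk_km * math.sqrt(rng.random())
    b0 = 360.0 * rng.random()
    # Uniform velocity on a disk, same construction.
    speed = vmax_kmh * math.sqrt(rng.random())
    course = 2.0 * math.pi * rng.random()
    x = r0 * math.cos(math.radians(b0)) + speed * hours * math.cos(course)
    y = r0 * math.sin(math.radians(b0)) + speed * hours * math.sin(course)
    r1 = math.hypot(x, y)
    b1 = math.degrees(math.atan2(y, x)) % 360.0
    n_bearing = int(round(360.0 / bearing_cell_deg))
    d_range = abs(int(r1 // range_cell_km) - int(r0 // range_cell_km))
    d_bearing = abs(int(b1 // bearing_cell_deg) - int(b0 // bearing_cell_deg))
    d_bearing = min(d_bearing, n_bearing - d_bearing)  # wrap around 360 degrees
    return d_range + d_bearing

def crossing_fractions(trials=20000, seed=0):
    """Fractions of tracks with zero crossings and with at least three."""
    rng = random.Random(seed)
    counts = [count_crossings(rng) for _ in range(trials)]
    p_zero = sum(c == 0 for c in counts) / trials
    p_three_plus = sum(c >= 3 for c in counts) / trials
    return p_zero, p_three_plus
```

Under these assumptions, `crossing_fractions()` gives the fraction of tracks that stay in one cell and the fraction that cross at least three boundaries, which can be compared qualitatively with the figure.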

The purpose of this appendix is to present analysis and simulation results showing the extent to which unknown
target  motion  degrades  achievable  detection  performance  in  large  surveillance  arrays. 

One  approach  to  combining  energy  in  different  spatial  cells  is  track-before-detect.  References  to  examples 
appear  in  the  next  few  paragraphs.  In  general,  a  tracking  detector  hypothesizes  various  possible  target  motions 
and  evaluates  a  detection  statistic  conditioned  on  each  target  motion  under  consideration.  When  the  statistic  falls 
in  a  critical  region,  both  the  existence  of  the  target  and  its  velocity  are  reported.  Since  the  number  of  possible 
target  motions  is  huge,  tracking  detector  implementations  require  a  great  deal  of  processing  power.  Algorithms 
based  on  similar  signal  and  noise  models  are  mainly  distinguished  by  the  shortcuts  and  approximations  required 
to  reduce  the  processing  loads  on  the  computers  used  to  implement  the  algorithms. 

Tracking detector implementations can be grouped in two broad categories: track initiation techniques and
probability map techniques. The former compute detection statistics for potential tracks, which are initiated by
energy peaks crossing a low threshold and are terminated when the track fails some kind of M-out-of-N rule
designed to monitor persistence of signal power. Examples include Blostein and Huang (1991), Bruton and
Bartley (1986), Nagarajan, Sharma, and Chidambara (1984), Shensa and Broman (1985), and Broman (1992).
(References for this appendix are listed at the end of the body of the report.) These track initiation approaches
are appropriate for low dimensional problems, but they may not scale up to the dimensions of HGI beamformer
outputs. They generally employ complex decision logic and require a good deal of tuning and adjustment of
error-handling heuristics in order to achieve good performance. See, for example, Gibbons et al. (1987) and
Struzinski (1978) and the articles referred to therein.

Probability  map  techniques  form  detection  statistics  for  a  large  number  of  designated  possible  target  motions, 
without  picking  and  choosing  the  motions  based  on  observed  signal  energies.  This  approach  uses  simpler  logic, 
but  can  require  more  processing  power  than  track  initiation  techniques.  On  the  other  hand,  the  processing 
required is highly parallelizable because of its simplicity. The dynamic programming implementations in Kessler
et al. (1988) and Barniv (1985), the bounded hop pixel statistic in Wei, Zeidler, and Ku (1992, 1993), and the
random walk approach introduced in this appendix are examples of probability map schemes. Other comparable
approaches are found in Mohanty (1981), Chen (1989), Cowart, Snyder, and Ruedger (1983), and Porat and
Friedlander (1990). Most pattern-matching and interactive approaches are also in this category, but are not
considered because their performance depends on the details of the beamformer.

The  State  Space. 

We  assume  that  the  inputs  to  the  tracking  detector  are  beamformed,  matched  field  processors,  Fourier 
transformed  and  normalized  by  some  kind  of  noise  spectrum  equalizer  or  by  an  adaptive  locally  optimum 
processing  algorithm.  Accordingly,  at  each  time  step  the  detector  receives  an  array  of  spatial  cells  that  may 
consist  of  noise  or  signal-plus-noise  measurements.  The  spatial  cells  are  parametrized  by  frequency,  discrete 
target  range  from  the  array,  bearing,  and  depth.  Generally,  the  particular  coordinate  system  used  for  this  state 
space  depends  on  the  beamforming,  but  a  state  space  having  dimension  of  at  least  4  is  likely  for  an  HGI  array. 

The question arises whether the dimension, or the precise form of the coordinate system, of the state space is
important for evaluating the effect of target motion on detection. Intuition suggests that the principal factor
influencing  the  performance  of  most  tracking  detectors  is  the  amount  of  noise  introduced,  instead  of  signal  plus 
noise,  due  to  the  inclusion  of  measurements  from  state  space  cells  not  containing  the  target.  Uncertainties  about 
the  true  target  path  bring  in  more  noise-only  measurements  in  a  higher  dimensional  space  than  in  a  lower 
dimensional  one  because  each  spatial  cell  has  more  neighbors.  We  suppose  that  the  driving  factor  is  the  number 
of  extraneous  noise-only  measurements  introduced  by  path  uncertainties,  and  only  secondarily  the  dimension  of 
the  space  in  which  tracking  is  performed.  Accordingly,  we  focus  on  the  number  of  cells  involved,  and  for 
simplicity's  sake  perform  simulations  in  one  dimension. 

Signal  Model. 

The model chosen for the narrowband signal and the noise in our simulations is intended to illuminate essential
characteristics of the track-before-detect problem. The results obtained are not expected to depend strongly on
model details. In particular, the structure of the tracking detection algorithm presented is applicable to
measurements for which a likelihood ratio can be computed. Specifically, we suppose that the measured signal
and noise are sampled and reported at intervals of T/N seconds, so that for each spatial cell, a window of T
seconds of data can be input into an N-point real Fourier transform.

For simplicity's sake, we assume first that signal energy from each target appears in only one state space bin at
each time step. This requires that the frequency of the signal be centered in its Fourier transform frequency bin
so as not to leak out, and that moving targets jump discretely from one spatial cell to the next in between the
T-second windows. Granted that real targets move continuously between spatial cells, it will become clear later
that failures of this assumption are only a minor problem for the random walk approach introduced in this
appendix. Of course, they could be more serious for other approaches that do not average results from
neighboring spatial cells. The time interval T must be short enough to justify the assumption of a motionless
target in the spatial processing and the assumptions of linear phase and constant amplitude of the signal during
each T-second coherent processing window. Therefore the analysis applies to those cases when an adaptive
locally optimum first-order detector is used to process the beamformer outputs and not to the cases when an
adaptive locally optimum second-order detector is used to process the beamformer outputs.

On the basis of this no-leakage assumption, we can now describe in standard terms a detection problem for each
spatial cell, each frequency bin, and each T-second interval. Assume the spectrum equalized noise samples $g(t)$
at $t = nT/N$ for $n = 0, 1, \ldots, N-1$ have zero mean, unit variance, and independent Gaussian distributions. We
postulate a random sinusoidal signal $s(t) = a r \cos(2\pi f t + 2\pi\theta)$, where $f = k/T$ for an integer k strictly between 0
and N/2. The phase offset $\theta$ is uniformly distributed on [0, 1). The amplitude factor r is Rayleigh distributed with
parameter $\sigma = 1$, and the factor a is known. Then we measure $z(t) = g(t)$ or $g(t) + s(t)$ under hypothesis $H_0$ or
$H_1$, respectively.
Figure C-1. Cumulative distribution of cell-boundary crossings. (Horizontal axis: number of spatial cell boundaries crossed.)
Modeling the phase offset and the amplitude as constant for exactly T seconds and then changing
instantaneously  to  new  random  values  for  the  next  T  seconds  is  an  approximation  that  is  analytically  tractable, 
and  it  allows  the  phase  and  amplitude  to  vary  randomly.  This  is  the  weakest  assumption  that  can  be  made  on 
moving  targets  and  has  consequences  that  we  wish  to  investigate. 

In  analyses  like  that  in  Pryor  (1971),  processing  gain  from  incoherent  averaging  is  computed  based  on  an 
assumption  of  constant,  known  signal  power.  As  long  as  the  detector  is  only  thresholding  measured  energy,  this 
common  assumption  is  inconsequential,  but  for  a  tracking  detector,  the  randomness  of  the  signal  power  is 
important. 


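Based on the assumptions introduced above, the likelihood function for the measurement N-vector z under $H_0$ is

$$P(z \mid H_0) = (2\pi)^{-N/2} \exp\Big(-\tfrac{1}{2}\sum_{n=0}^{N-1} z(nT/N)^2\Big).$$

Recalling that $P(\theta) = 1$ and $P(r) = r e^{-r^2/2}$, we find that

$$P(z, r, \theta \mid H_1) = (2\pi)^{-N/2}\, r e^{-r^2/2} \exp\Big(-\tfrac{1}{2}\sum_{n=0}^{N-1}\big[z(nT/N) - a r\cos(2\pi k n/N + 2\pi\theta)\big]^2\Big).$$

The marginal density function for z can be computed from these equations directly, with some difficulty.
Simplicity is gained by transforming variables as follows: $x = r\cos(2\pi\theta)$, $y = r\sin(2\pi\theta)$, $dx\,dy = 2\pi r\,dr\,d\theta$. Then

$$P(z \mid H_1) = (2\pi)^{-N/2-1}\, e^{-\|z\|^2/2} \iint e^{-G/2}\,dx\,dy,$$

where we expand the exponent G as follows. Defining N-vectors $c_k$ and $s_k$, whose n-th components are
$\cos(2\pi k n/N)$ and $\sin(2\pi k n/N)$, respectively:

$$\begin{aligned}
G &= \|z - a x c_k + a y s_k\|^2 - \|z\|^2 + x^2 + y^2 \\
  &= x^2 - 2 a x\, c_k \cdot z + a^2 x^2 \|c_k\|^2 + y^2 + 2 a y\, s_k \cdot z + a^2 y^2 \|s_k\|^2 \\
  &= (1 + a^2\|c_k\|^2)\Big[x - \frac{a\, c_k \cdot z}{1 + a^2\|c_k\|^2}\Big]^2 - \frac{(a\, c_k \cdot z)^2}{1 + a^2\|c_k\|^2}
   + (1 + a^2\|s_k\|^2)\Big[y + \frac{a\, s_k \cdot z}{1 + a^2\|s_k\|^2}\Big]^2 - \frac{(a\, s_k \cdot z)^2}{1 + a^2\|s_k\|^2}.
\end{aligned}$$

In this form it is easy to integrate out the x and y dependence to obtain the likelihood ratio:

$$l(z) = \frac{P(z \mid H_1)}{P(z \mid H_0)} = \frac{1}{\sqrt{(1 + a^2\|c_k\|^2)(1 + a^2\|s_k\|^2)}} \exp\Big(\frac{(a\, c_k \cdot z)^2}{2(1 + a^2\|c_k\|^2)} + \frac{(a\, s_k \cdot z)^2}{2(1 + a^2\|s_k\|^2)}\Big).$$

Recall that $\|c_k\|^2 = \|s_k\|^2 = N/2$ and that the k-th coefficient of the discrete Fourier transform of the sequence z is
$z(k) = z \cdot (c_k - i s_k)$. For notational simplicity, let $\xi(a, N) = a^2 N/2$.

The above expression for $l(z)$ can then be written

$$l(z) = \frac{1}{1 + \xi} \exp\Big(\frac{\xi\, |z(k)|^2}{N(1 + \xi)}\Big).$$

Since a and N are fixed, we see that the likelihood ratio is a monotone function of the power in the k-th frequency bin.

Next, we determine the distribution of $|z(k)|^2$ under $H_0$ and $H_1$, and compute the expected signal-to-noise ratio
of the detector inputs. Under $H_1$, the signal plus noise case, the measurement vector $z = g + a x c_k - a y s_k$
consists of N zero-mean Gaussian components, so any linear functional of z has a zero-mean Gaussian
distribution. In particular,

$$E\big[(c_k \cdot z)^2\big] = E\big[(s_k \cdot z)^2\big] = \frac{N}{2}\Big(1 + \frac{a^2 N}{2}\Big), \qquad E\big[(c_k \cdot z)(s_k \cdot z)\big] = 0.$$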
Remembering that a chi-squared random variable with 2 degrees of freedom is just an exponential random
variable with mean 2, we conclude that $|z(k)|^2$ is exponential with mean $N(1 + a^2 N/2)$ under $H_1$. We obtain the
distribution for $|z(k)|^2$ under $H_0$ by setting a = 0. This means the presence of a signal in $|z(k)|^2$ changes the
mean of its exponential distribution from $N$ to $N(1 + a^2 N/2)$.
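The change in the mean of the bin power can be checked by simulation. The sketch below draws the noise, phase, and Rayleigh amplitude as described in the signal model and forms the power in the k-th discrete Fourier transform bin directly from its definition; the function names and the particular values of N, k, and a used with it are illustrative assumptions.

```python
import math
import random

def bin_power(rng, N, k, a, signal_present):
    """Power |z(k)|^2 in DFT bin k for one T-second window."""
    theta = rng.random()                                # phase offset on [0, 1)
    # Rayleigh amplitude with parameter 1, by inverse transform sampling.
    r = math.sqrt(-2.0 * math.log(1.0 - rng.random()))
    re = 0.0
    im = 0.0
    for n in range(N):
        z = rng.gauss(0.0, 1.0)                         # unit-variance noise
        if signal_present:
            z += a * r * math.cos(2.0 * math.pi * (k * n / N + theta))
        re += z * math.cos(2.0 * math.pi * k * n / N)   # z . c_k
        im -= z * math.sin(2.0 * math.pi * k * n / N)   # -(z . s_k)
    return re * re + im * im

def mean_bin_power(N, k, a, signal_present, trials=4000, seed=1):
    """Sample mean of the bin power over independent windows."""
    rng = random.Random(seed)
    total = sum(bin_power(rng, N, k, a, signal_present) for _ in range(trials))
    return total / trials
```

With N = 16, k = 3, and a = 0.5, the sample means should come out near N = 16 without a signal and near N(1 + a²N/2) = 48 with one, up to Monte Carlo error.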


The signal-to-noise ratio (per frequency bin, not per hertz), including the processing gain due to band pass
filtering, is easy to compute. The signal power is $a^2 r^2/2$ with expected value $a^2$. Letting g denote the vector of N
noise-only inputs, the expected noise power in the k-th frequency bin is

$$\frac{1}{N}\, E\left[\Big(\frac{g \cdot c_k}{\|c_k\|}\Big)^2 + \Big(\frac{g \cdot s_k}{\|s_k\|}\Big)^2\right] = \frac{2}{N}.$$

Consequently, the ratio of expected signal power to expected noise power is $a^2 N/2$.


Known  Path  Detection  Performance. 


A detector based on the Neyman-Pearson criterion employs the likelihood ratio as a test statistic:

/(z)  =  .  where  a  and  are  assumed  known  a  priori,  or  else  are  accurately  estimated  from 

other  data. 


Now, to obtain a detector for measurements like z, we set a threshold on $|z(k)|^2$, or on $l(z)$, or on any other
monotonic function of the likelihood ratio, in order to select the best track. But because of the tracking to be done
below, we need to handle convex combinations of likelihood ratios, and the addition operations involved prevent
our transforming the likelihood ratios nonlinearly, as we might otherwise do by adding logarithms and replacing
products of likelihood ratios by sums of "energies." This is the reason why the tracking detector under discussion
should not be thought of as just "summing energy over time" for an unknown target track.

Assuming a known target path, the likelihood ratio for M successive measurements taken along that path is
$\lambda_M = \prod_{j=1}^{M} l_j(z)$. Under $H_1$, the random variable $\ln(\lambda_M) + M\ln(1 + a^2N/2)$ is then equivalent to $a^2N/2$ times the sum of M
exponential random variables with mean one, and it has a Gamma distribution with parameters M and $2/(Na^2)$. By
reversing the transformation, the distribution for $\lambda_M$ can be derived from the Gamma distribution.

We  now  describe  the  Neyman-Pearson  detection  performance  of  our  algorithm  for  the  known  path  case  by 
means of these distributions. An important point arises because our surveillance system does not just process M
measurements and then shut down; it continues with the next M. If a system designer doubles M, he not only
doubles the amount of information available at each detection opportunity, but he also halves the number of
detection opportunities per hour. Both effects are significant for system performance. All of our performance
results are formulated in terms of the minimum detectable signal level associated with a given operating point
$(P_D, P_{FA})$, for various path lengths M. In evaluating such a minimum detectable signal level, is it appropriate to
try  to  fix  the  probability  of  false  alarm  or  detection  per  detection  opportunity,  when  the  length  of  the  detection 
opportunity  itself  is  being  varied?  A  fair  assessment  might  be  based  on  detections  or  false  alarms  per  hour 
instead  of  per  opportunity. 

To help assure that equality of minimum detectable signal thresholds implies equal detection performance on
minimally detectable signals, we allow the operating point $(P_D(M), P_{FA}(M))$ to depend on M, and attempt to
formulate an appropriate equivalence condition to relate operating points chosen for different values of M. Our
criterion is that nominally equivalent operating points should generate approximately equal false-alarm rates per
unit time. If, say, we are comparing detectors with path lengths of M = 1, 2, 4, 8, 16 points, then it seems
appropriate to assume that the 16/M detection opportunities occurring in an interval of length 16 are combined
with an "if-any" combiner, which declares a detection if any one or more of the 16/M subintervals declares a
detection. In this case, identical detection statistics for each combined 16-point detector can be guaranteed by
requiring that

$$[1 - P_D(M)]^{16/M} = 1 - P_D(16) \quad\text{and}\quad [1 - P_{FA}(M)]^{16/M} = 1 - P_{FA}(16).$$

More generally, we require that $[1 - P(M)]^{1/M}$ be invariant over M for equivalent operating points.

Checking the consequences of this rule, we see that for small $P_{FA}$, the false-alarm rate per unit time is
approximately

$$\frac{P_{FA}(M)}{M} = \frac{1 - [1 - P_{FA}(1)]^M}{M} \approx \frac{P_{FA}(1)}{1},$$

just as desired.


As a specific example, the following $P_D$ settings are considered equivalent under this rule:

M          1       2       4     8      16
P_D(M)     0.159   0.293   0.5   0.75   0.938
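The equivalence rule is simple to apply numerically. The following sketch converts a probability specified at one path length into the equivalent probability at another by holding $[1 - P(M)]^{1/M}$ fixed; the function name is an illustrative assumption.

```python
def equivalent_probability(p_ref, m_ref, m):
    """Probability at path length m equivalent to p_ref at path length m_ref.

    Keeps [1 - P(M)]**(1/M) invariant, as for an "if-any" combiner of
    independent detection opportunities.
    """
    return 1.0 - (1.0 - p_ref) ** (m / m_ref)

def equivalent_table(p_ref, m_ref, lengths=(1, 2, 4, 8, 16)):
    """Equivalent probabilities for a list of path lengths."""
    return {m: equivalent_probability(p_ref, m_ref, m) for m in lengths}
```

Calling `equivalent_table(0.5, 4)` generates the settings equivalent to a detection probability of 0.5 at M = 4; the same function applies to false-alarm probabilities.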

Figure C-2 shows the strength of the minimum detectable signal level in decibels per bin, i.e., $10\log_{10}(\xi)$ with $\xi$ the
signal-to-noise ratio per bin, for a known path detector achieving $P_D(1) = 1/2$ (or equivalent), given various values
of M and $P_{FA}(1)$ (or equivalent). From the figure we conclude that for fixed $P_{FA}(1)$ the minimum detectable signal
level in decibels seems to be approximately a linear function of log(M), and that the level decreases about 0.5 to 1 dB per
doubling of M. A similar graph, as shown in figure C-3, based on $P_D(4) = 1/2$ and various $P_{FA}(4)$, looks almost
identical, except for the level being shifted 4.2 dB lower.
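A minimal sketch of how such minimum detectable signal levels could be computed for the known path case is given below. It assumes the detector thresholds the sum of M normalized bin powers, which is a sum of M unit-mean exponentials under noise only and is scaled by one plus the signal-to-noise ratio when the signal is present. The function names are illustrative, and the closed-form Erlang tail is used only because M is an integer.

```python
import math

def erlang_sf(x, M):
    """P(S > x) for S the sum of M unit-mean exponential variables."""
    term = 1.0
    total = 1.0
    for j in range(1, M):
        term *= x / j
        total += term
    return math.exp(-x) * total

def erlang_isf(p, M):
    """Threshold x with P(S > x) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while erlang_sf(hi, M) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_sf(mid, M) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def min_detectable_snr(pd, pfa, M):
    """Smallest per-bin signal-to-noise ratio meeting (pd, pfa) with M looks."""
    return erlang_isf(pfa, M) / erlang_isf(pd, M) - 1.0

def mds_db_equivalent(pd1, pfa1, M):
    """Level in dB at the operating point equivalent to (pd1, pfa1) at M = 1."""
    pd = 1.0 - (1.0 - pd1) ** M
    pfa = 1.0 - (1.0 - pfa1) ** M
    return 10.0 * math.log10(min_detectable_snr(pd, pfa, M))
```

Evaluating `mds_db_equivalent(0.5, pfa1, M)` for M = 1, 2, 4, 8, 16 over a range of `pfa1` gives curves that can be compared with the trend per doubling of M described above.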

Figure C-4, by contrast, plots the same minimum detectable signal levels against $P_{FA}(M)$ for a fixed $P_D(M)$.

This shows the effect of increasing the path length M, without compensating for the change in the number of
detection opportunities per hour. As expected, the curves are more widely spaced than in figures C-2 and C-3,
illustrating the benefits expected from increasing M in a system that is shut down after the first and only detection
opportunity.

Intuition suggests that the minimum detectable level of the tracking detector might be improved to an arbitrary
level by increasing M. This turns out to be false, understood in terms of minimum detectable signal level for fixed
$(P_D(1), P_{FA}(1))$. An approximate limit as M increases without bound can be obtained by means of an
asymptotic expansion (Prof. T. Sellke, Stanford, private communication):

$$\lim_{M \to \infty} \xi_M + 1 = \frac{W\big(-(1 - P_{FA}(1))/e\big)}{W\big(-(1 - P_D(1))/e\big)},$$

where $\xi_M$ is the minimum detectable level. The limit is in terms of the Lambert
function $W(x)$, which is defined by $x = W(x)e^{W(x)}$ for $x \ge -1/e$. The limit found in this manner is plotted as the
"M large" curve in Figure C-2, where "large" is interpreted to mean $M > 1/P_{FA}$. This curve has a horizontal
asymptote of 5.2 dB for small $P_{FA}$. It is not known whether this limit is approached monotonically or whether the
limit is a bound for the actual performance, although this is probably the case. If this were the case, even given
perfect knowledge of the target path there would be a limit to the performance gain obtainable by combining
measurements over time.
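The quoted 5.2 dB asymptote can be reproduced with a short calculation. The sketch below is not the asymptotic expansion referred to above; it assumes a standard large-deviation argument for sums of unit-mean exponentials, in which the normalized threshold τ solves τ - 1 - ln τ = -ln(1 - P), equivalently τ = -W(-(1 - P)/e) on the principal branch, and the limiting level plus one is the ratio of the false-alarm and detection solutions. The function names are illustrative.

```python
import math

def lower_root(c):
    """Solve tau - 1 - ln(tau) = c for tau in (0, 1], with c >= 0.

    The left side decreases from +infinity to 0 on (0, 1], so bisection
    applies. Equivalent to tau = -W(-exp(-1 - c)) on the principal branch.
    """
    lo, hi = 1e-300, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - 1.0 - math.log(mid) > c:
            lo = mid            # mid is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

def limiting_mds(pd1, pfa1):
    """Large-M limit of the minimum detectable signal-to-noise ratio."""
    tau_fa = lower_root(-math.log1p(-pfa1))
    tau_d = lower_root(-math.log1p(-pd1))
    return tau_fa / tau_d - 1.0

def limiting_mds_db(pd1, pfa1):
    """Same limit expressed in decibels."""
    return 10.0 * math.log10(limiting_mds(pd1, pfa1))
```

For a detection probability of 1/2 at M = 1 and a very small false-alarm probability, `limiting_mds_db` approaches roughly 5.2 dB, in line with the horizontal asymptote stated in the text.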

Figure C-5 displays the thresholds on a Gamma distribution for $P_D(1) = 0.9$ and $P_{FA}(1) = 0.5$ and various M.
The curves appear linear. If they were linear (and they are not), then the limiting MDS for large M would be
the ratio of slopes minus one. Unfortunately, poor numerical conditioning prevents the computation of these
curves for small $P_{FA}$ or large M.

Figure C-2. Known path detector performance for PD(1) = 0.50. (Horizontal axis: log10(PFA(1)).)

Figure C-3. Known path detector performance for PD(4) = 0.50. (Horizontal axis: log10(PFA(4)).)

Motion Models and Combining Likelihood Ratios over Spatial Cells.

Most target motion models treat velocity as approximately constant, estimating 2d parameters in a d-dimensional
space,  for  position  and  velocity.  Normally,  either  highly  accurate  measurements  or  some  idea  of  target  mission, 
plan,  or  tactics  are  required  to  estimate  acceleration.  Acceleration  estimation  may  be  appropriate  in  some  aircraft 
and  missile  applications,  but  for  undersea  surveillance  applications  it  is  not  considered  necessary. 

Wei's and Zeidler's (1992, 1993) bounded hop motion model is an example of a model estimating only d
parameters. Target velocity is assumed bounded by an a priori maximum speed, but is otherwise free to change
arbitrarily from time step to time step. This approach was previously investigated under the HGI project, and
concerns arose about whether the performance was relevant, mainly because it did not seem to take any
advantage of the consistency in target velocity that one would expect to see for submarines. Accumulation of
likelihood ratio statistics under the assumption of known target velocity requires forming the product of the
likelihood ratios of the measurements at each time step at the predicted position of the target. The following
recursive formula computes the accumulated likelihood ratio at time step M + 1 and position $x_{M+1}$ from the
individual likelihood ratio for the measurement at $x_{M+1}$ and the accumulated likelihood ratio from the previous
time step and the previous position:

$$L_{M+1}(x_{M+1}) = l_{M+1}(x_{M+1})\, L_M(x_M).$$

If the velocity is not known, but only bounded by B spatial cells per time step, then Wei's and Zeidler's algorithm
accumulates a pixel statistic (which does not seem to be a likelihood ratio for M > 1) by the following formula:

$$L_{M+1}(x_{M+1}) = l_{M+1}(x_{M+1}) \max_{|x_{M+1} - x_M| \le B} L_M(x_M).$$

Only one likelihood ratio is maintained for each spatial cell at each time step. Performance results quoted below
include results from Wei's algorithm, which are treated as a baseline for comparison.
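A one-dimensional sketch of the bounded hop accumulation is shown below. It is an illustration of the recursion as read from the text, not the cited authors' implementation; the function name and the truncation of the search window at the ends of the grid are assumptions.

```python
def bounded_hop_update(prev, lr, B):
    """One time step of the bounded hop pixel statistic on a 1-D grid.

    prev[x] is the accumulated statistic at the previous time step and
    lr[x] the single-measurement likelihood ratio at the current step.
    Each cell takes the largest previous statistic within B cells of it.
    """
    n = len(prev)
    out = []
    for x in range(n):
        lo = max(0, x - B)          # window is truncated at the grid edges
        hi = min(n, x + B + 1)
        out.append(lr[x] * max(prev[lo:hi]))
    return out
```

Starting from a list of ones and applying `bounded_hop_update` once per time step yields one accumulated statistic per spatial cell, as described above.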

Random  Walk  Paths. 

A likelihood ratio statistic with better detection performance than Wei's pixel statistic can be based on a target
motion model of a random walk on a d-dimensional grid, with independent identically distributed increments. The
extra information used is consistency of target velocity over time. A random walk modeling ignorance of target
velocity models the velocity by a distribution with mean zero and substantial variance; a random walk modeling
fairly good knowledge of velocity models the velocity by a distribution with nonzero mean and small variance. The
requirement that the target velocity distribution be discrete and not vary from time step to time step imposes a
lower bound on the variance of the average velocity, but this bound decreases like 1/M and is not onerous.

Let the state space cells be points on a grid in a discrete vector space, $x_M \in Z^d$. Consider the random
generation of the target path to occur in two stages: first, a starting point and a random walk are chosen, the
walk only roughly determining the average velocity of the target, and second, an instance of that random walk is
chosen to generate the exact path taken by the target. Consider a random walk, W, starting at $x_1$, with
increments, or steps, with density $P(\Delta x)$, which does not vary over time or space. The initial segments of the
walk are random sequences of points written as $w_M = (x_1, x_2, \ldots, x_M)$, and the expected average velocity of any
segment of the walk is $E(\Delta x)/T$.

We accumulate measurement likelihood ratios for a walk ending at $x_{M+1}$ with a formula of the form
$L_{M+1}(x_{M+1}, W) = l_{M+1}(x_{M+1}) \sum_{x_M} P(x_M \mid x_{M+1})\, L_M(x_M, W)$. We exclude the possibility of two targets with
crossing paths, i.e., we assume that all the alternatives are exhausted by specifying hypothesis $H_0$ to mean that
no targets exist along any path being considered and hypothesis $H_1(x_M)$ to mean that exactly one target exists,
and it is traveling along a path through $x_M$ (at the time considered). Let $z^M(x_M)$ denote the measurement vector
at the spatial cell $x_M$ taken in the time window ending at time TM. Then the likelihood ratio $l_M$ obtained from a
single measurement is

$$l_M(x_M) = \frac{P\big(z^M(x_M) \mid H_1(x_M)\big)}{P\big(z^M(x_M) \mid H_0\big)}.$$


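Let $x_M$ denote either the point, or the event, $H_1(x_M)$. The event $w_M$ is the conjunction of points, or events,
$H_1(x_1), H_1(x_2), \ldots, H_1(x_M)$. The accumulated likelihood ratio is the sum over all paths which are instances of
W:

$$L_{M+1}(x_{M+1}, W) = \frac{\sum_{w_M} P\big(z^{M+1}(x_{M+1}), z^M(x_M), \ldots, z^1(x_1), w_M \mid x_{M+1}\big)}{\prod_{m=1}^{M+1} P\big(z^m(x_m) \mid H_0\big)}.$$

Expand and rearrange the numerator of this expression to obtain a recursion relation:

$$\begin{aligned}
\sum_{w_M} P\big(z^{M+1}(x_{M+1}), z^M(x_M), \ldots, z^1(x_1), w_M \mid x_{M+1}\big)
&= P\big(z^{M+1}(x_{M+1}) \mid x_{M+1}\big) \sum_{w_M} P\big(z^M(x_M), \ldots, z^1(x_1), w_M \mid x_{M+1}\big) \\
&= P\big(z^{M+1}(x_{M+1}) \mid x_{M+1}\big) \sum_{x_M} P(x_M \mid x_{M+1})\, P\big(z^M(x_M) \mid x_M\big) \sum_{w_{M-1}} P\big(z^{M-1}(x_{M-1}), \ldots, z^1(x_1), w_{M-1} \mid x_M\big).
\end{aligned}$$

Divide the equations above by the denominator of $L_{M+1}(x_{M+1}, W)$, $\prod_{m=1}^{M+1} P\big(z^m(x_m) \mid H_0\big)$, and substitute in the
transition probability $P(x_M \mid x_{M+1}) = P(\Delta x = x_{M+1} - x_M)$ defined by the random walk to obtain

$$L_{M+1}(x_{M+1}, W) = l_{M+1}(x_{M+1}) \sum_{x_M} P(\Delta x = x_{M+1} - x_M)\, L_M(x_M, W).$$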

The initial values that start the recursion are $L_0(x_0, W) = 1$ for all $x_0$ and W. In a practical implementation, one
would set $P(\Delta x) = 0$ for all but two or three increment values $\Delta x$, so that the sum over $x_M$ would involve only two
or three points. But the presence of addition operations complicates any attempt to manipulate log likelihoods
instead of likelihoods.
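The recursion for a single random walk can be sketched in one dimension as follows. This is an illustrative sketch rather than the simulation code used for the report: the function name is hypothetical, the increment density is passed as a small dictionary, and a circular grid is assumed so that no probability is lost at the edges.

```python
def random_walk_update(prev, lr, step_probs):
    """One time step of the random walk likelihood recursion on a 1-D grid.

    prev[x]    accumulated likelihood ratio L_M(x, W)
    lr[x]      single-measurement likelihood ratio l_{M+1}(x)
    step_probs dict mapping an increment dx to its probability P(dx)

    Returns L_{M+1}(x, W) = lr[x] * sum over dx of P(dx) * prev[x - dx],
    with indices taken modulo the grid size (circular grid assumption).
    """
    n = len(prev)
    return [
        lr[x] * sum(p * prev[(x - dx) % n] for dx, p in step_probs.items())
        for x in range(n)
    ]
```

Starting from all ones and calling `random_walk_update` once per time step with, for example, `{-1: 0.25, 0: 0.5, 1: 0.25}` as the increment density gives the accumulated likelihood ratio for that walk in every cell. Because the update mixes sums and products, it is carried out here on likelihoods rather than log likelihoods.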


One can visualize the relative contribution of the likelihood ratios obtained over time from neighboring spatial cells by plotting the probability of the random walk passing through each of those cells. Figure C-6 is a surface plot of the probability at each time step, 1 through 15, of a random walk that ends up in cell 0 at time 16 and that has an expected velocity of minus one cell per time step. The points in the past history that contribute part of their likelihood to $L_{16}(x_{16}, W = -1.0)$ are arranged in a narrow fan shape; the middle of the fan points in the direction of the expected average velocity of the random walk. The amount each point contributes to the accumulated likelihood ratio is greatest near the middle of the fan. This averaging of nearby likelihoods is the reason little difficulty occurs with our random walk algorithm when signal energy leaks out of the spatial cell containing the true target position.
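
The fan shape can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than the computation behind Figure C-6: it assumes independent, identically distributed increments, no prior preference for the starting cell, and a hypothetical three-point increment density with mean -1. The function name `backward_occupancy` is introduced here for illustration only.

```python
import numpy as np

def backward_occupancy(increments, probs, n_steps):
    """For k = 1..n_steps, return (cells, pmf): the probability that a walk
    ending in cell 0 occupied each cell k steps earlier."""
    lo, hi = min(increments), max(increments)
    step_pmf = np.zeros(hi - lo + 1)
    for dx, p in zip(increments, probs):
        step_pmf[dx - lo] += p
    history = []
    pmf = np.array([1.0])            # zero steps back: certainly in cell 0
    for k in range(1, n_steps + 1):
        # pmf of the sum of k increments, supported on k*lo .. k*hi
        pmf = np.convolve(pmf, step_pmf)
        sums = np.arange(k * lo, k * hi + 1)
        # end cell = past cell + sum of increments, so past cell = -sum
        history.append((-sums[::-1], pmf[::-1]))
    return history

# Walk with expected velocity -1 cell per step, traced back 15 steps.
fan = backward_occupancy((-2, -1, 0), (0.25, 0.5, 0.25), 15)
```

In this sketch the mean occupied cell k steps before the end is k, and the spread grows roughly as the square root of k, which gives the narrow fan centered on the expected velocity.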


Serious attempts at deriving a probability distribution for $L_{M+1}(x_{M+1}, W)$, even under the simplest conditions, yielded unworkably complex formulas, so results for the unknown path case are obtained from Monte Carlo simulations. To conduct Monte Carlo simulations, we create an array of likelihoods, one for each spatial cell and each random walk going through that cell, and update the array recursively at each time step by applying equation 1. In order to evaluate the likelihood of a target being in a given spatial cell, regardless of how it got there, we compute a weighted sum of all the likelihoods for that spatial cell, the weights being the prior probability of each of the random walks going through that cell. The increments of each random walk have a density structured to produce the desired expected velocity and to satisfy continuity conditions. The density on the average velocity of the random walk is trinomial, but has a bell shape. A prior Gaussian density on target velocity is approximated by a mixture of a small collection of these multinomials, with the mixture weights being the prior probability of the corresponding random walk. For comparisons with the random walk algorithm, the hop-bound $B$ for Wei's algorithm is set relative to the prior target velocity distribution at the $2\sigma$ point.
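
The weighted sum over random walks can be sketched as follows. The array layout (one row per random walk, one column per spatial cell) and both function names are assumptions made for illustration. In particular, sampling a Gaussian at each walk's expected velocity and normalizing is only one plausible way to obtain the mixture weights; the report does not state how its weights were computed.

```python
import numpy as np

def gaussian_mixture_weights(walk_velocities, mean, sigma):
    """Hypothetical prior weights: a Gaussian evaluated at each walk's
    expected velocity, normalized to sum to one."""
    v = np.asarray(walk_velocities, dtype=float)
    w = np.exp(-0.5 * ((v - mean) / sigma) ** 2)
    return w / w.sum()

def combine_over_walks(L_per_walk, weights):
    """Weighted sum over walks of the accumulated likelihood ratios.

    L_per_walk : array of shape (n_walks, n_cells)
    weights    : prior probability of each walk, shape (n_walks,)
    Returns one likelihood ratio per spatial cell.
    """
    return np.asarray(weights, dtype=float) @ np.asarray(L_per_walk, dtype=float)

# Five walks with expected velocities -2..2 cells per step, prior sigma of 1.
weights = gaussian_mixture_weights([-2, -1, 0, 1, 2], mean=0.0, sigma=1.0)
per_cell = combine_over_walks(np.ones((5, 7)), weights)
```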

Signal  Gain  Degradation  for  Unknown  Path. 

Figures C-7a, b, c, and d show the increase in minimum detectable signal level caused by lack of knowledge of the target path. This is computed as the level required in Monte Carlo simulations of tracking detection minus the analytically defined minimum detectable signal level for the known path case. Each figure shows random walk and Wei's algorithm results for $M = 1, 2, 4, 8, 16$. The $P_D$ is fixed at $P_D(4) = 1/2$ and $\log(P_{FA}(4))$ is plotted on the abscissa, while level differences in decibels are plotted on the ordinate. Figures C-7a, b, c, and d differ from each other in the size of the variance of the prior velocity distribution. This determines the number of neighboring spatial cells a target might travel into during one time step of $T$ seconds. Figures C-7a, b, c, and d present results for standard deviations of 0.5, 1, 2, and 4 cells per time step. The results look similar. The level differences depend weakly on $P_{FA}$, and increase by about 0.5 dB per doubling of $M$ for the random walk algorithm, and somewhat more for Wei's algorithm. Figures C-8 and C-9 illustrate the degradation caused by poor estimates of the prior velocity variance, which err by a factor of 4 and 1/4, respectively.

Signal  Gain  from  Incoherent  Integration. 

Figures C-10a, b, c, and d illustrate the signal loss, or gain, i.e. the change in minimum detectable signal level, caused by choosing an $M$ greater than one. This was computed as the level required in Monte Carlo simulations of tracking detection for various $M$ less the analytically obtainable level for the case $M = 1$. Each figure presents random walk and Wei's algorithm results for $M = 1, 2, 4, 8, 16$. The $P_D$ is fixed at $P_D(4) = 1/2$ and $P_{FA}(4)$ is plotted on the abscissa, while level differences in decibels are plotted on the ordinate. The results presented in figures C-10a, b, c, and d are for different variances of the prior velocity distribution, as for figures C-7a, b, c, and d. The crossing of the curves at $P_{FA}(4) = 10^{-?}$ for the random walk, and at $P_{FA}(4) = 10^{-?}$ for Wei's algorithm, shows that weak signals (supporting only large $P_{FA}$'s) cause trouble for tracking detectors, and that for the weakest signals setting $M = 1$ appears optimal. The break-even point being higher for the random walk algorithm is a sign that it is performing more strongly than Wei's algorithm.

A more useful description of the regime of operation in which tracking detection profits from $M > 1$ can be given in terms of the increase, or decrease, of $P_{FA}(4)$ caused by choosing a path length of $M$ instead of one, given fixed $P_D(4)$ and various minimum detectable signal levels. This is illustrated in figures C-11a, b, c, and d. Observe that the break-even points are near 3.5 dB for the random walk algorithm, and 6 dB for Wei's algorithm. These break-even points bound the range of signal-to-noise ratios for which tracking detection offers a potential benefit.
Figure C-7. Dependence of detector on prior velocity variance.
Figure C-8. Dependence of detector performance on low estimate of prior velocity variance.
Figure C-9. Dependence of detector performance on high estimate of prior velocity variance.
Figure C-10. Detector performance for multiple spatial cells relative to performance for a single spatial cell.
Figure C-11. Dependence of detector performance on path length.
High $P_D$ Regime.


The question arises whether the price of not knowing the target velocity is significant for strong signals, or high values of $P_D$, since the track ought to be evident from the data in such cases. If the track is evident, the tracking detector performance should approach that of the known-path case, given strong enough signals. Figure C-12 illustrates the change in minimum detectable signal levels from the known path case to the tracking detector case, for $P_D(4) = 0.95$ and various $P_{FA}$. The difference is not zero. The increase in levels for the random walk detector might be attributable to the fact that likelihood ratios for neighboring spatial cells are averaged together whether the signal is strong or weak, so that some noise-only measurements contribute to the detector. This seems unavoidable for the present algorithm implementation, because the variance of the velocity of the random walk has a lower bound imposed by the discretization of target motion.

Summary. 

The  processing  load  increases  with  increasing  M  depending  on  the  specific  tracking  algorithm.  For  Wei's 
algorithm,  the  processing  is  proportional  to  M.  For  the  random  walk  algorithm,  there  is  an  additional  processing 
load,  proportional  to  .  which  depends  on  the  number  of  random  walks  required  to  model  possible  target 
velocities.  Chen  (1989)  examines  the  signal  loss  caused  by  finite  discretization  of  the  velocity  in  a  probability 
map tracking detector. He shows that for $d = 2$ the number of velocity cells required to meet a loss constraint increases as $M^{?}$, which corresponds to our $M^{?}$ rate.

The minimum detectable signal level for the known path case seems bounded as $M \to \infty$, so that tracking detectors can gain only a bounded advantage over an if-any combiner, but the bound may be large. In any case, the loss in level caused by doubling $M$, which is weakly dependent on the operating point, is a little less than 1 dB per doubling.

The signal loss caused by lack of knowledge of the target path is approximately $\tfrac{1}{2}\log_2 M$ dB. This loss is also weakly dependent on the operating point. The break-even point in signal-to-noise ratio, below which a tracking detector offers no benefit for detection, depends on the detection algorithm employed. Observe that when the signal is too weak, it cannot be identified among noise peaks that line up by coincidence.
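
As a worked illustration, reading the unknown-path loss as roughly one half of $\log_2 M$ in decibels (which matches the figure of about 0.5 dB per doubling of $M$ reported above for the random walk algorithm) gives the following values for the path lengths used in the simulations. The symbol $\Delta_{\text{path}}$ is introduced here only for this example, and the values are rule-of-thumb estimates, not simulation results.

```latex
\[
\Delta_{\text{path}}(M) \approx \tfrac{1}{2}\log_2 M \ \text{dB}
\quad\Longrightarrow\quad
\begin{array}{c|ccccc}
M & 1 & 2 & 4 & 8 & 16 \\ \hline
\Delta_{\text{path}}(M)\ \text{(dB)} & 0 & 0.5 & 1.0 & 1.5 & 2.0
\end{array}
\]
```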
Figure C-12. Dependence of detector performance on knowledge of track.